SEED PLANTS CHARACTERIZED BY DELAYED SEED DISPERSAL

This invention was made with government support
under DCB9018749 awarded by the National Science
Foundation. The government has certain rights in the
invention.



BACKGROUND OF THE INVENTION 



FIELD OF THE INVENTION 

The present invention relates generally to
plant molecular biology and genetic engineering and, more
specifically, to the production of genetically modified
seed plants in which the natural process of dehiscence is
delayed.

BACKGROUND INFORMATION 

Rapeseed is one of the most important oilseed
crops after soybeans and cottonseed, representing 10% of
the world oilseed production in 1990. Rapeseed
contains 40% oil, which is pressed from the seed, leaving
a high-protein seed meal of value for animal feed and
nitrogen fertilizer. Rapeseed oil, also known as canola
oil, is a valuable product, representing the fourth most
commonly traded vegetable oil in the world.

The production of oilseeds, meal and oil from
rapeseed plants has been increasing continuously for the
last 30 years for food and feed grains, mainly by
expansion of the area under cultivation. Most northern
European countries produce rapeseed as their main edible
oil crop. By the year 2000, China is expected to be the
leading producer with 9.2 million metric tons (Mt; 26%),
followed by India with 7.8 Mt (22%); the European
Community (12 countries), with 7.6 Mt (21%); Canada,
3.8 Mt (11%) and eastern Europe with 2.6 Mt (7%).

Unfortunately, the yield of seed from rapeseed
and related plants is limited by pod dehiscence, which is
a process that occurs late in fruit development whereby
the pod is opened and the enclosed seeds released.
Degradation and separation of cell walls along a discrete
layer of cells dividing the two halves of the pod, termed
the "dehiscence zone," result in separation of the two
halves of the pod and release of the contained seeds.
Seed "shattering," whereby seeds are prematurely shed
through dehiscence before the crop can be harvested, is a
significant problem faced by commercial seed producers
and represents a loss of income to the industry. Adverse
weather conditions can exacerbate the process of
dehiscence, resulting in greater than 50% loss of seed
yield.

Attempts to solve this problem over the past 20
years have focused on the breeding of shatter-resistant
varieties. However, these plant hybrids are frequently
sterile and lose favorable characteristics that must be
regained by backcrossing, which is both time-consuming
and laborious. Other strategies to alleviate pod
shattering include the use of chemicals such as pod
sealants or mechanical techniques such as swathing to
reduce wind-stimulated shattering. To date, however, a
simple method for producing genetically modified seed
plants that do not open and release their seeds
prematurely has not been described.

Thus, a need exists for identifying genes that
regulate the dehiscence process and for developing
genetically modified seed plant varieties in which the
natural seed dispersal process is delayed. The present
invention satisfies this need and provides related
advantages as well.

SUMMARY OF THE INVENTION 

The present invention provides a non-naturally
occurring seed plant that is characterized by delayed
seed dispersal due to ectopic expression of a nucleic
acid molecule encoding an AGL8-like gene product. The
AGL8-like gene product can have, for example,
substantially the amino acid sequence of an AGL8 ortholog
such as Arabidopsis AGL8 (SEQ ID NO:2). Particularly
useful seed plants of the invention, which are
characterized by delayed seed dispersal, include members
of the Brassicaceae, such as rapeseed, and members of the
Fabaceae, such as soybeans, peas, lentils and beans.

In one embodiment, the invention provides a
transgenic seed plant that is characterized by delayed
seed dispersal due to ectopic expression of a nucleic
acid molecule encoding an AGL8-like gene product. In a
transgenic seed plant of the invention, the nucleic acid
molecule encoding the AGL8-like gene product can be
operatively linked to an exogenous regulatory element.
Useful exogenous regulatory elements include constitutive
regulatory elements and dehiscence zone-selective
regulatory elements. In particular, the exogenous
regulatory element can be a dehiscence zone-selective



WO 99/00502 PCT/US98/13208



regulatory element that is an AGL1 regulatory element or
an AGL5 regulatory element.

In another embodiment, the invention provides a
non-naturally occurring seed plant that is characterized
by delayed seed dispersal due to suppression of both AGL1
and AGL5 expression in the seed plant. Such a
non-naturally occurring seed plant characterized by
delayed seed dispersal can be, for example, an agl1 agl5
double mutant.



The present invention further provides a tissue
derived from a non-naturally occurring seed plant of the
invention. In one embodiment, the invention provides a
tissue derived from a non-naturally occurring seed plant
that has an ectopically expressed nucleic acid molecule
encoding an AGL8-like gene product and is characterized
by delayed seed dispersal. In another embodiment, the
invention provides a tissue derived from a non-naturally
occurring seed plant in which AGL1 expression and AGL5
expression each are suppressed, where the seed plant is
characterized by delayed seed dispersal.

Methods of producing a non-naturally occurring
seed plant characterized by delayed seed dispersal also
are provided herein. Such methods entail ectopically
expressing a nucleic acid molecule encoding an AGL8-like
gene product in the seed plant, whereby seed dispersal is
delayed due to ectopic expression of the nucleic acid
molecule.



The invention also provides a substantially
purified dehiscence zone-selective regulatory element,
comprising a nucleotide sequence that confers selective
expression upon an operatively linked nucleic acid
molecule in the valve margin or dehiscence zone of a seed
plant, provided that the dehiscence zone-selective
regulatory element does not have a nucleotide sequence
consisting of nucleotides 1889 to 2703 of SEQ ID NO:4.
The dehiscence zone-selective regulatory element can be,
for example, an AGL1 regulatory element or AGL5
regulatory element.

Further provided is a plant expression vector
containing a dehiscence zone-selective regulatory element
that confers selective expression upon an operatively
linked nucleic acid molecule in the valve margin or
dehiscence zone of a seed plant, provided that the
dehiscence zone-selective regulatory element does not
have a nucleotide sequence consisting of nucleotides 1889
to 2703 of SEQ ID NO:4. If desired, a plant expression
vector can contain a nucleic acid molecule encoding an
AGL8-like gene product in addition to the dehiscence
zone-selective regulatory element.

The invention also provides a kit for producing
a transgenic seed plant characterized by delayed seed
dispersal, such kit containing a dehiscence
zone-selective regulatory element that confers selective
expression upon an operatively linked nucleic acid
molecule in the valve margin or dehiscence zone of a seed
plant, provided that said dehiscence zone-selective
regulatory element does not have a nucleotide sequence
consisting of nucleotides 1889 to 2703 of SEQ ID NO:4.
In a kit of the invention, the dehiscence zone-selective
regulatory element can be, if desired, operatively linked
to a nucleic acid molecule encoding an AGL8-like gene
product.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 shows a scanning electron micrograph
of an Arabidopsis gynoecium at about the time of
pollination. A number of distinct cell types are shown,
including the apical stigma, the style, and the ovary.
The ovary walls, or valves, which are separated along
their entire lengths by a small suture denoted the
"replum," are indicated. The dehiscence zone, a narrow
band of cells one to three cells wide along the
valve/replum boundary, also is indicated.

Figure 2 shows a wild type Arabidopsis fruit 
immediately following pod shattering. The seeds as well 
as the replum are clearly visible. 

Figure 3 shows scanning electron micrographs of
wild type Arabidopsis and a representative 35S::AGL8
transgenic line. The dehiscence zone is evident in the
wild type plant. In contrast, in the 35S::AGL8
transgenic line, the cells of the outer replum are
converted to a valve cell fate, and the dehiscence zone
is absent.

Figure 4 shows the agl5 and agl1 genomic
regions and the loss of AGL5 or AGL1 expression,
respectively, in the agl5 or agl1 mutant. Figure 4A
shows the genomic structure of the AGL5 gene, with the
positions of exons indicated by boxes, and the positions
of introns indicated by thin lines. The agl5 mutant
allele, generated by targeted disruption following
homologous recombination, has a kanamycin resistance
cassette that is indicated by a yellow hatched box and
located within the MADS-box region. Figure 4B shows the
genomic structure of the AGL1 gene, with the position of
the approximately 17 kb T-DNA insertion into the large
intron of the agl1-1 locus indicated by the arrowhead.
Exons are indicated by boxes. Introns are indicated by
thin lines. The MADS-box region is shown as a hatched
box. Figure 4C shows that a probe specific for the 3'
end of the AGL5 complementary DNA (cDNA) detected the AGL5
transcript in wild type but not in the agl5 knockout
mutant plants. Figure 4D shows that a probe specific for
the 3' end of the AGL1 cDNA detected
the AGL1 transcript in wild type but not in the agl1
mutant generated by T-DNA insertion.

Figure 5 shows scanning electron micrographs of
wild type Arabidopsis and an agl1 agl5 double mutant.
The valves are beginning to detach from the replum in the
wild type Arabidopsis fruits, which are shown during the
process of dehiscence. At the same time in development,
the valves of the agl1 agl5 double mutant plant remain
attached to the replum.

Figure 6 shows the nucleotide (SEQ ID NO:1) and
amino acid (SEQ ID NO:2) sequence of Arabidopsis AGL8.

Figure 7 shows the nucleotide sequence of the
Arabidopsis AGL1 gene (SEQ ID NO:3). The exons and
translation start site are indicated.

Figure 8 shows the nucleotide sequence of the
Arabidopsis AGL5 gene (SEQ ID NO:4). The exons and
translation start site are indicated.



DETAILED DESCRIPTION OF THE INVENTION

The present invention provides a non-naturally
occurring seed plant that is characterized by delayed
seed dispersal due to ectopic expression of a nucleic
acid molecule encoding an AGL8-like gene product. The
AGL8-like gene product can have, for example,
substantially the amino acid sequence of an AGL8 ortholog
such as Arabidopsis AGL8 (SEQ ID NO:2).

The fruit, a complex structure unique to
flowering plants, mediates the maturation and dispersal
of seeds. In most flowering plants, the fruit consists
of the pericarp, which is derived from the ovary wall,
and the seeds, which develop from fertilized ovules.
Arabidopsis, which is typical of the more than 3000
species of the Brassicaceae, produces fruit in which the
two carpel valves (ovary walls) are joined to the replum,
a visible suture that divides the two carpels. The
structure of an Arabidopsis gynoecium around the time of
pollination, including the carpel valves and replum, is
shown in Figure 1.

Pod dehiscence or shatter occurs late in fruit
development in a wide spectrum of important plant crops,
such as oilseed rape (Brassica napus L.), and is a process
of economic importance that can lead to significant
losses in seed yield. In oilseed rape, dehiscence
involves the breakdown of cell wall material in a
discrete cell layer known as the "dehiscence zone," which
is a region of only one to three cells in width that
extends along the entire length of the valve/replum
boundary (Meakin and Roberts, J. Exp. Botany 41:995-1002
(1990)). As the cells in the dehiscence zone separate
from one another, the valves detach from the replum,
allowing seeds to be dispersed (see Figure 2).

The plant hormone ethylene is produced by
developing seeds and appears to be an important regulator
of the dehiscence process. One line of evidence
supporting a role for ethylene in regulation of
dehiscence comes from studies of fruit ripening, which,
like fruit dehiscence, is a process involving the
breakdown of cell wall material. In fruit ripening,
ethylene acts in part by activating cell wall degrading
enzymes such as polygalacturonase (Theologis et al.,
Develop. Genetics 14:282-295 (1993)). Moreover, in
genetically modified tomato plants in which the ethylene
response is blocked, such as transgenic tomato plants
expressing antisense polygalacturonase, there is a
significant delay in fruit ripening (Lanahan et al., The
Plant Cell 6:521-530 (1994); Smith et al., Nature
334:724-726 (1988)).

In dehiscence, ultrastructural changes that
culminate in degradation of the middle lamella of
dehiscence zone cell walls weaken rapeseed pods and
eventually lead to pod shatter. As in fruit ripening,
hydrolytic enzymes including polygalacturonases play a
role in this programmed breakdown. For example, in
oilseed rape, a specific endo-polygalacturonase, RDPG1,
is upregulated and expressed exclusively in the
dehiscence zone late in pod development (Petersen et al.,
Plant Mol. Biol. 31:517-527 (1996), which is incorporated
herein by reference). Ethylene may regulate the activity
of hydrolytic enzymes involved in the process of
dehiscence as it does in fruit ripening (Meakin and
Roberts, J. Exp. Botany 41:1003-1011 (1990), which is
incorporated herein by reference). Yet, until now, the
proteins that control the process of dehiscence, such as
those regulating the relevant hydrolytic enzymes, have
eluded identification.

The present invention is directed to the
surprising discovery that the AGL8 transcription factor
regulates the process of dehiscence. As disclosed
herein, Arabidopsis plants were transformed with an AGL8
cDNA under control of a 35S cauliflower mosaic virus
(CaMV) constitutive promoter such that AGL8 was
ectopically expressed throughout the transformed plant.
In particular, AGL8, which is normally expressed in the
carpel valves, was ectopically expressed in the replum,
which is a small strip of cells separating the two valves
in a mature fruit. As a consequence of such ectopic
expression, the replum of the fruit was absent, with the
cells of the outer replum replaced by cells having
characteristics of valve identity, demonstrating that, in
this context, AGL8 expression is sufficient to specify
valve cell fate. Furthermore, ectopic expression of the
AGL8 cDNA produced a transgenic plant in which the
dehiscence zone failed to develop normally, resulting in
delayed seed dispersal (see Example I). Whereas wild
type Arabidopsis produced fruit that opened and released
seeds on or about 14 days after pollination, transformed
Arabidopsis ectopically expressing AGL8 produced fruit in
which seed dispersal was postponed, or in which the seeds
were never released unless the fruit was opened manually
(see Figure 3). Thus, for the first time, seed plants
were genetically modified to delay the natural process of
dehiscence.

The present invention also relates to the
surprising discovery that an agl1 agl5 double mutant seed
plant has a delayed seed dispersal phenotype that is
strikingly similar to the AGL8 gain-of-function
phenotype. As disclosed herein, loss-of-function
mutations in the AGL1 and AGL5 genes were produced by
disruptive T-DNA insertion and homologous recombination
(see Example II). In the resulting agl1 agl5 double
mutant plants, the dehiscence zone failed to develop
normally, and the mature fruits did not undergo
dehiscence (see Figure 5). Thus, AGL1 or AGL5 gene
expression is required for development of the dehiscence
zone. These results indicate that AGL1, AGL5 and AGL8
regulate pod dehiscence and that manipulation of AGL1,
AGL5 and AGL8 expression can allow the process of pod
shatter to be controlled.

Thus, the present invention provides a
non-naturally occurring seed plant that is characterized
by delayed seed dispersal due to ectopic expression of a
nucleic acid molecule encoding an AGL8-like gene product.
The AGL8-like gene product can have, for example,
substantially the amino acid sequence of an AGL8 ortholog
such as Arabidopsis AGL8 (SEQ ID NO:2).

As used herein, the term "non-naturally
occurring," when used in reference to a seed plant, means
a seed plant that has been genetically modified by man.
A transgenic seed plant of the invention, for example, is
a non-naturally occurring seed plant that contains an
exogenous nucleic acid molecule encoding an AGL8-like
gene product and, therefore, has been genetically
modified by man. In addition, a seed plant that
contains, for example, a mutation in an endogenous
AGL8-like gene product regulatory element or coding
sequence as a result of calculated exposure to a
mutagenic agent, such as a chemical mutagen, or an
"insertional mutagen," such as a transposon, also is
considered a non-naturally occurring seed plant since it
has been genetically modified by man. In contrast, a
seed plant containing only spontaneous or naturally
occurring mutations is not a "non-naturally occurring
seed plant" as defined herein and, therefore, is not
encompassed within the invention. One skilled in the art
understands that, while a non-naturally occurring seed
plant typically has a nucleotide sequence that is altered
as compared to a naturally occurring seed plant, a
non-naturally occurring seed plant also can be
genetically modified by man without altering its
nucleotide sequence, for example, by modifying its
methylation pattern.

The term "ectopically," as used herein in
reference to expression of a nucleic acid molecule
encoding an AGL8-like gene product, refers to an
expression pattern that is distinct from the expression
pattern in a wild type seed plant. Thus, one skilled in
the art understands that ectopic expression of a nucleic
acid encoding an AGL8-like gene product can refer to
expression in a cell type other than a cell type in which
the nucleic acid molecule normally is expressed, or at a
time other than a time at which the nucleic acid molecule
normally is expressed, or at a level other than the level
at which the nucleic acid molecule normally is expressed.
In wild type Arabidopsis, for example, AGL8 expression is
normally restricted during the later stages of floral
development to the carpel valves and is not seen in the
replum, which is the small strip of cells separating the
carpel valves. However, under control of a constitutive
promoter such as the cauliflower mosaic virus 35S
promoter, AGL8 is expressed in the replum and,
additionally, is expressed at higher than normal levels
in other tissues such as valve margin and, thus, is
ectopically expressed.

The term "delayed," as used herein in reference
to the timing of seed dispersal in a fruit produced by a
non-naturally occurring seed plant of the invention,
means a significantly later time of seed dispersal as
compared to the time seeds normally are dispersed from a
corresponding seed plant lacking an ectopically expressed
nucleic acid molecule encoding an AGL8-like gene product.
Thus, the term "delayed" is used broadly to encompass
both seed dispersal that is significantly postponed as
compared to the seed dispersal in a corresponding seed
plant, and seed dispersal that is completely
precluded, such that fruits never release their seeds
unless there is human or other intervention.

It is recognized that there can be natural
variation of the time of seed dispersal within a seed
plant species or variety. However, a "delay" in the time
of seed dispersal in a non-naturally occurring seed plant
of the invention readily can be identified by sampling a
population of the non-naturally occurring seed plants and
determining that the normal distribution of seed
dispersal times is significantly later, on average, than
the normal distribution of seed dispersal times in a
population of the corresponding seed plant species or
variety that does not contain an ectopically expressed
nucleic acid molecule encoding an AGL8-like gene product.
Thus, production of non-naturally occurring seed plants
of the invention provides a means to skew the normal
distribution of the time of seed dispersal from
pollination, such that seeds are dispersed, on average,
at least about 1%, 2%, 5%, 10%, 30%, 50% or 100% later
than in the corresponding seed plant species that does
not contain an ectopically expressed nucleic acid
molecule encoding an AGL8-like gene product.
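By way of illustration only, the comparison of average dispersal times described above can be sketched as follows. The sample values, function names and the use of a simple difference of means are assumptions made for this sketch; they are not data or procedures from the present disclosure, and the sketch does not itself test statistical significance.

```python
from statistics import mean

# Percentages recited in the text: seeds dispersed "at least about"
# this much later, on average, than in the corresponding seed plant.
THRESHOLDS_PERCENT = (1, 2, 5, 10, 30, 50, 100)


def percent_delay(modified_days, control_days):
    """Return how much later (in percent) the mean dispersal time of the
    modified population is, relative to the control population mean."""
    control_mean = mean(control_days)
    return 100.0 * (mean(modified_days) - control_mean) / control_mean


def thresholds_met(delay_percent):
    """List the recited thresholds that the observed delay reaches."""
    return [t for t in THRESHOLDS_PERCENT if delay_percent >= t]


# Hypothetical days from pollination to seed dispersal (illustrative only).
control = [13, 14, 14, 15, 14]
modified = [16, 17, 15, 18, 16]

delay = percent_delay(modified, control)
print(f"mean delay: {delay:.1f}%  thresholds met: {thresholds_met(delay)}")
```

In practice, a significance test suited to the sampled distributions would accompany such a comparison, since the text requires that dispersal be significantly later on average.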

A delay in seed dispersal of even one to two
days can be valuable in increasing the amount of seed
successfully harvested from a seed plant. In canola
rapeseed, for example, dehiscence normally occurs about 8
weeks post-pollination. In a non-naturally occurring
canola rapeseed that ectopically expresses an AGL8-like
gene product, dehiscence can occur one to two days later
than in the wild type variety, allowing a significantly
greater percentage of the seed crop to be harvested
rather than lost through uncontrolled seed dispersal.

The present invention relates to the use of
nucleic acid molecules encoding particular "AGAMOUS-LIKE"
or "AGL" gene products. AGAMOUS (AG) is a floral organ
identity gene, one of a related family of transcription
factors that, in various combinations, specify the
identity of the floral organs: the petals, sepals,
stamens and carpels (Bowman et al., Devel. 112:1-20
(1991); Weigel and Meyerowitz, Cell 78:203-209 (1994);
Yanofsky, Annual Rev. Plant Physiol. Mol. Biol.
46:167-188 (1995)). The AGAMOUS gene product is
essential for specification of carpel and stamen identity
(Bowman et al., The Plant Cell 1:37-52 (1989); Yanofsky
et al., Nature 346:35-39 (1990)). Related genes have
recently been identified and denoted "AGAMOUS-LIKE" or
"AGL" genes (Ma et al., Genes Devel. 5:484-495 (1991);
Mandel and Yanofsky, The Plant Cell 7:1763-1771 (1995),
which is incorporated herein by reference).

AGL8, like AGAMOUS and other AGL genes, is
characterized, in part, in that it is a plant MADS box
gene. The plant MADS box genes generally encode proteins
of about 260 amino acids including a highly conserved
MADS domain of about 56 amino acids (Riechmann and
Meyerowitz, Biol. Chem. 378:1079-1101 (1997), which is
incorporated herein by reference). The MADS domain,
which was first identified in the Arabidopsis AGAMOUS and
Antirrhinum majus DEFICIENS genes, is conserved among
transcription factors found in humans (serum response
factor; SRF) and yeast (MCM1; Norman et al., Cell
55:989-1003 (1988); Passmore et al., J. Mol. Biol.
204:593-606 (1988)), and is the most highly conserved
region of the MADS domain proteins. The MADS domain is
the major determinant of sequence specific DNA-binding
activity and can also perform dimerization and other
accessory functions (Huang et al., The Plant
Cell 8:81-94 (1996)). The MADS domain frequently resides
at the N-terminus, although some proteins contain
additional residues N-terminal to the MADS domain.

The "intervening domain" or "I-domain," located
immediately C-terminal to the MADS domain, is a weakly
conserved domain having a variable length of
approximately 30 amino acids (Purugganan et al., Genetics
140:345-356 (1995)). In some proteins, the I-domain
plays a role in the formation of DNA-binding dimers. A
third domain present in plant MADS domain proteins is a
moderately conserved 70 amino acid region denoted the
"keratin-like domain" or "K-domain." Named for its
similarity to regions of the keratin molecule, the
structure of the K-domain appears capable of forming
amphipathic helices and may mediate protein-protein
interactions (Ma et al., Genes Devel. 5:484-495 (1991)).
The most variable domain, both in sequence and in length,
is the carboxy-terminal or "C-domain" of the MADS domain
proteins. Dispensable for DNA binding and protein
dimerization in some MADS domain proteins, the function
of this C-domain remains unknown.

Arabidopsis AGL8 is a 242 amino acid MADS box
protein (see Figure 6; SEQ ID NO:2; Mandel and Yanofsky,
supra, 1995). The AGL8 MADS domain resides at amino
acids 2 to 56 of SEQ ID NO:2. The K-domain of AGL8
resides at amino acids 92 to 158 of SEQ ID NO:2.
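For illustration, the domain coordinates recited above can be applied to a protein sequence as in the following sketch. The coordinates (1-based, inclusive) and the 242 amino acid length come from the text. The placeholder sequence, the dictionary layout and the function names are assumptions of this sketch, and the placeholder is not the sequence of SEQ ID NO:2.

```python
# Length and 1-based inclusive domain coordinates recited for Arabidopsis AGL8.
AGL8_LENGTH = 242
AGL8_DOMAINS = {
    "MADS": (2, 56),
    "K": (92, 158),
}


def domain_length(start, end):
    """Number of residues in a 1-based inclusive interval."""
    return end - start + 1


def extract_domain(sequence, start, end):
    """Return the residues from position start to end (1-based, inclusive)."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("domain coordinates fall outside the sequence")
    return sequence[start - 1:end]


# Placeholder of the correct length only; NOT the real AGL8 sequence.
placeholder = "X" * AGL8_LENGTH

for name, (start, end) in AGL8_DOMAINS.items():
    segment = extract_domain(placeholder, start, end)
    print(name, start, end, len(segment))
```

Under these coordinates the MADS domain spans 55 residues and the K-domain spans 67 residues, in keeping with the approximate sizes of about 56 and about 70 amino acids given for plant MADS domain proteins generally.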

In wild-type Arabidopsis, AGL8 RNA accumulates
in two distinct phases, the first occurring during
inflorescence development in the stem and cauline leaves
and the second in the later stages of flower development
(Mandel and Yanofsky, supra, 1995). In particular, AGL8
RNA is first detected in the inflorescence meristem as
soon as the plant switches from vegetative to
reproductive development. As the inflorescence stem
elongates, AGL8 RNA accumulates in the inflorescence
meristem and in the stem. Secondly, although AGL8 is not
detected in the initial stages (1 and 2) of flower
development, AGL8 expression resumes at approximately
stage 3 in the center of the floral dome in the region
corresponding to the fourth (carpel) whorl. AGL8
expression is excluded from all other primordia and the
pedicel. The time of AGL8 expression in the fourth
carpel whorl generally corresponds to the time at which
the organ identity genes APETALA3, PISTILLATA and AGAMOUS
begin to be expressed (Yanofsky et al., Nature 346:35-39
(1990); Drews et al., Cell 65:991-1002 (1991); Jack et
al., Cell 68:683-697 (1992); Goto and Meyerowitz, Genes
Devel. 8:1548-1560 (1994)). At later stages, AGL8
expression becomes localized to the carpel walls, in the
region that constitutes the valves of the ovary, and is
absent from nearly all other cell types of the carpel.
No RNA expression is detected in the ovules,
stigmatic tissues or the septum that divides the ovary.
Thus, in nature, AGL8 expression during the later stages
of floral development is restricted to the valves of the
carpels and to the cells within the style.

As used herein, the term "AGL8-like gene
product" means a gene product that has the same or
similar function as Arabidopsis AGL8 such that, when
ectopically expressed in a seed plant, the normal
development of the dehiscence zone is altered, and seed
dispersal is delayed. An AGL8-like gene product can
have, for example, the ability to convert cells of the
outer replum to a valve cell identity. Arabidopsis AGL8
(SEQ ID NO:2) is an example of an AGL8-like gene product
as defined herein. As disclosed in Example I, ectopic
expression of Arabidopsis AGL8 (SEQ ID NO:2) under
control of a tandem CaMV 35S promoter, in which the
intrinsic promoter element has been duplicated, alters
formation of the dehiscence zone, thereby resulting in
fruit characterized by a complete lack of seed dispersal.
An AGL8-like gene product also can be characterized, in
part, by its ability to interact with AGL1 and,
additionally, its ability to interact with AGL5.

An AGL8-like gene product generally is
characterized, in part, by having an amino acid sequence
that has at least about 50% amino acid identity with the
amino acid sequence of Arabidopsis AGL8 (SEQ ID NO:2).
An AGL8-like gene product can have, for example, an amino
acid sequence with greater than about 65% amino acid
sequence identity with Arabidopsis AGL8 (SEQ ID NO:2),
preferably greater than about 75% amino acid identity
with Arabidopsis AGL8 (SEQ ID NO:2), more preferably
greater than about 85% amino acid identity with
Arabidopsis AGL8 (SEQ ID NO:2), and can be a sequence
having greater than about 90%, 95% or 97% amino acid
identity with Arabidopsis AGL8 (SEQ ID NO:2).
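As a minimal sketch of how such identity percentages might be computed, the following example scores two sequences that have already been aligned. The text does not specify an alignment method or how gaps are counted, so the choices here (a pre-computed alignment with "-" as the gap character, gapped columns ignored, and ">=" used for every tier) are assumptions of the sketch, as are the toy sequences and function names.

```python
# Identity levels recited in the text, in percent.
IDENTITY_TIERS = (50, 65, 75, 85, 90, 95, 97)


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither
    aligned sequence has a gap (one of several possible conventions)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def highest_tier(identity):
    """Highest recited identity level reached, or None if below 50%."""
    met = [t for t in IDENTITY_TIERS if identity >= t]
    return max(met) if met else None


# Toy aligned fragments (hypothetical, for illustration only).
reference = "MGRGRVQLKR"
candidate = "MGRGKVQ-KR"

identity = percent_identity(reference, candidate)
print(f"{identity:.1f}% identity, highest tier met: {highest_tier(identity)}")
```

A different gap convention or alignment algorithm can change the computed percentage, so the convention used should be stated whenever a candidate sequence is compared against these levels.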

Preferably, an AGL8-like gene product is
orthologous to the seed plant species in which it is
ectopically expressed. A nucleic acid molecule encoding
Arabidopsis AGL8 (SEQ ID NO:2), for example, can be
ectopically expressed in an Arabidopsis plant to produce
a non-naturally occurring Arabidopsis variety
characterized by delayed seed dispersal. Similarly, a
nucleic acid molecule encoding canola AGL8 can be
ectopically expressed in a canola plant to produce a
non-naturally occurring canola variety characterized by
delayed seed dispersal.

A nucleic acid molecule encoding an AGL8-like
gene product also can be ectopically expressed in a
heterologous seed plant to produce a non-naturally
occurring seed plant characterized by delayed seed
dispersal. AGAMOUS-like gene products have been widely
conserved throughout the plant kingdom; for example,
AGAMOUS has been conserved in tomato (TAG1) and maize
(ZAG1), indicating that orthologs of AGAMOUS-like genes
are present in most, if not all, angiosperms (Pnueli et
al., The Plant Cell 6:163-173 (1994); Schmidt et al., The
Plant Cell 5:729-737 (1993)). AGL8-like gene products
such as AGL8 orthologs also can be conserved and can
function across species boundaries to delay seed
dispersal. Thus, ectopic expression of a nucleic acid
molecule encoding Arabidopsis AGL8 (SEQ ID NO:2) in a
heterologous seed plant within the Brassicaceae such as
Brassica napus L. (rapeseed) or within the Fabaceae such
as in Glycine (soybean) can alter normal development of
the dehiscence zone, thereby resulting in delayed seed
dispersal. Furthermore, a nucleic acid molecule encoding
Arabidopsis AGL8 (SEQ ID NO:2), for example, can be
ectopically expressed in more distantly related
heterologous seed plants, including dehiscent seed plants
as well as other dicotyledonous and monocotyledonous
angiosperms and gymnosperms and, upon ectopic expression,
can alter normal development of the dehiscence zone and
delay seed dispersal in the heterologous seed plant.

As used herein, the term "AGL8-like gene
product" encompasses an active segment of an AGL8-like
gene product, which is a polypeptide portion of an
AGL8-like gene product that, when ectopically expressed,
alters normal development of the dehiscence zone and
delays seed dispersal. An active segment can be, for
example, an amino terminal, internal or carboxy terminal
fragment of Arabidopsis AGL8 (SEQ ID NO:2) that, when
ectopically expressed in a seed plant, alters normal
development of the dehiscence zone and delays seed
dispersal. An active segment of an AGL8-like gene
product can include, for example, the MADS domain and can
have the ability to bind DNA specifically. The skilled
artisan will recognize that a nucleic acid molecule
encoding an active segment of an AGL8-like gene product
can be useful in producing a seed plant of the invention
characterized by delayed seed dispersal and in the
related methods and kits of the invention described
further below.

An active segment of an AGL8-like gene product
can be identified using the methods described in
Example I or using other routine methodology. Briefly, a
seed plant such as Arabidopsis can be transformed with a
nucleic acid molecule under control of a constitutive
regulatory element such as a tandem CaMV 35S promoter.
Phenotypic analysis of the seed plant reveals whether a
seed plant ectopically expressing a particular
polypeptide portion is characterized by delayed seed
dispersal. In transgenic plants in which seed dispersal
is delayed, further analysis can be performed to confirm
that normal development of the dehiscence zone has been
altered. For analysis of a large number of polypeptide
portions of an AGL8-like gene product, nucleic acid
molecules encoding the polypeptide portions can be
assayed in pools, and active pools subsequently
subdivided to identify the active nucleic acid molecule.
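
The pooled assay described above can be illustrated as a recursive
halving search. The following is a minimal sketch only, not a protocol
from this disclosure: the names find_active, toy_assay and the
fragment labels are hypothetical, and the assay is assumed to score a
pool as active whenever it contains at least one active member.

```python
from typing import Callable, List, Sequence


def find_active(pool: Sequence[str],
                assay_pool: Callable[[Sequence[str]], bool]) -> List[str]:
    """Return the members of `pool` that account for its activity.

    `assay_pool` is a hypothetical stand-in for the phenotypic assay
    (delayed seed dispersal in plants transformed with the pool).
    """
    if not pool or not assay_pool(pool):
        return []                  # inactive pool: discard without subdividing
    if len(pool) == 1:
        return list(pool)          # a single active construct has been isolated
    mid = len(pool) // 2           # subdivide the active pool, re-assay each half
    return (find_active(pool[:mid], assay_pool)
            + find_active(pool[mid:], assay_pool))


# Toy example: the fragment labels and the active set are invented.
fragments = [f"frag{i}" for i in range(1, 9)]
truly_active = {"frag3"}
calls = []


def toy_assay(members: Sequence[str]) -> bool:
    calls.append(tuple(members))   # record every assay that was run
    return any(m in truly_active for m in members)


hits = find_active(fragments, toy_assay)
print(hits, len(calls))
```

With one active member among eight, the sketch isolates it in seven
pooled assays; the saving over assaying each construct separately grows
with the number of polypeptide portions screened.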

In one embodiment, the invention provides a
non-naturally occurring seed plant that is characterized
by delayed seed dispersal due to ectopic expression of a
nucleic acid molecule encoding an AGL8-like gene product
having substantially the amino acid sequence of an AGL8
ortholog. As used herein, the term "AGL8 ortholog" means
an ortholog of Arabidopsis AGL8 (SEQ ID NO:2) and refers
to an AGL8-like gene product that, in a particular seed
plant variety, has the highest percentage homology at the
amino acid level to Arabidopsis AGL8 (SEQ ID NO:2). An
AGL8 ortholog can be, for example, a Brassica AGL8
ortholog such as a Brassica napus L. AGL8 ortholog, or a
Fabaceae AGL8 ortholog such as a soybean, pea, lentil, or
bean AGL8 ortholog. An AGL8 ortholog from the long-day
plant Sinapis alba, designated SaMADS B, has been
described (Menzel et al., Plant J. 9:399-408 (1996),
which is incorporated herein by reference). Novel AGL8
ortholog cDNAs can be isolated from additional seed plant
species using a nucleotide sequence as a probe and
methods well known in the art of molecular biology (Glick
and Thompson (eds.), Methods in Plant Molecular Biology
and Biotechnology, Boca Raton, FL: CRC Press (1993);
Sambrook et al. (eds.), Molecular Cloning: A Laboratory
Manual (Second Edition), Plainview, NY: Cold Spring
Harbor Laboratory Press (1989), each of which is
incorporated herein by reference).
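
The definition of an AGL8 ortholog as the gene product with the
highest percentage homology to Arabidopsis AGL8 can be illustrated
with a short sketch. It rests on several assumptions that are not part
of this disclosure: percent identity is used as a simple proxy for
"percentage homology," the sequences are assumed to be pre-aligned to
equal length with "-" marking gaps, and the ten-residue strings and
candidate names are invented placeholders rather than real sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def pick_ortholog(reference: str, candidates: dict) -> str:
    """Name of the candidate with the highest percent identity to `reference`."""
    return max(candidates,
               key=lambda name: percent_identity(reference, candidates[name]))


# Invented toy data; these are NOT real AGL8 sequences.
reference = "MGRGRVQLKR"
candidates = {
    "candidate_A": "MGRGKVQLKR",   # one mismatch in ten columns
    "candidate_B": "MGRGKIELKR",   # three mismatches in ten columns
    "candidate_C": "MG-GRVQMRR",   # one gap, two mismatches in nine columns
}

best = pick_ortholog(reference, candidates)
print(best, percent_identity(reference, candidates[best]))
```

In practice the alignment itself would be produced by a standard
alignment program, and the comparison would be run against every
related gene product in the seed plant variety of interest.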



As used herein, the term "substantially the
amino acid sequence," when used in reference to an AGL8
ortholog, is intended to mean a polypeptide or
polypeptide segment having an identical amino acid
sequence, or a polypeptide or polypeptide segment having
a similar, non-identical sequence that is considered by
those skilled in the art to be a functionally equivalent
amino acid sequence. For example, an AGL8-like gene
product having substantially the amino acid sequence of
Arabidopsis AGL8 can have an amino acid sequence
identical to the sequence of Arabidopsis AGL8 (SEQ ID
NO:2) shown in Figure 6, or a similar, non-identical
sequence that is functionally equivalent. In particular,
an amino acid sequence that is "substantially the amino
acid sequence" of AGL8 can have one or more modifications
such as amino acid additions, deletions or substitutions
relative to the AGL8 amino acid sequence shown (SEQ ID
NO:2), provided that the modified polypeptide retains
substantially the ability to alter normal development of
the dehiscence zone and delay seed dispersal when
ectopically expressed in the seed plant. Comparison of
sequences for substantial similarity can be performed
between two sequences of any length and usually is
performed with sequences between about 6 and 1200
residues, preferably between about 10 and 100 residues
and more preferably between about 25 and 35 residues.
Such comparisons for substantial similarity are performed
using methodology routine in the art.

It is understood that minor modifications of
primary amino acid sequence can result in an AGL8-like
gene product that has substantially equivalent or
enhanced function as compared to the AGL8 ortholog from
which it was derived. Further, various molecules can be
attached to an AGL8 ortholog or active segment thereof,
for example, other polypeptides, antigenic or other
peptide tags, carbohydrates, lipids, or chemical
moieties. Such modifications are included within the
term AGL8 ortholog as defined herein.

One or more point mutations can be introduced
into a nucleic acid molecule encoding an AGL8 ortholog to
yield a modified nucleic acid molecule using, for
example, site-directed mutagenesis (see Wu (Ed.), Meth.
In Enzymol. Vol. 217, San Diego: Academic Press (1993);
Higuchi, "Recombinant PCR" in Innis et al. (Ed.), PCR
Protocols, San Diego: Academic Press, Inc. (1990), each
of which is incorporated herein by reference). Such
mutagenesis can be used to introduce a specific, desired
amino acid insertion, deletion or substitution;
alternatively, a nucleic acid sequence can be synthesized
having random nucleotides at one or more predetermined
positions to generate random amino acid substitutions.
Scanning mutagenesis also can be useful in generating a
modified nucleic acid molecule encoding substantially the
amino acid sequence of an AGL8 ortholog.

Modified nucleic acid molecules can be
routinely assayed for the ability to alter normal
development of the dehiscence zone and to delay seed
dispersal. In the same manner as described in Examples I
and III, a nucleic acid molecule encoding substantially
the amino acid sequence of an AGL8 ortholog can be
ectopically expressed, for example, using a constitutive
regulatory element such as the CaMV 35S promoter or using
a dehiscence zone-selective regulatory element such as
the AGL1 promoter. If such ectopic expression results in
a seed plant in which the dehiscence zone fails to
develop and in which seed dispersal is delayed, the
modified polypeptide or segment is an "AGL8 ortholog" as
defined herein.

A non-naturally occurring seed plant of the
invention that is characterized by delayed seed dispersal
can be one of a variety of seed plant species, such as a
dehiscent seed plant or another monocotyledonous or
dicotyledonous angiosperm or gymnosperm. A useful seed
plant of the invention can be a dehiscent seed plant, and
a particularly useful seed plant of the invention can be
a member of the Brassicaceae, such as rapeseed, or a
member of the Fabaceae, such as a soybean, pea, lentil or
bean plant.

As used herein, the term "seed plant" means an
angiosperm or gymnosperm. An angiosperm is a
seed-bearing plant whose seeds are borne in a mature
ovary (fruit). An angiosperm commonly is recognized as a
flowering plant. Angiosperms are divided into two broad
classes based on the number of cotyledons, which are seed
leaves that generally store or absorb food. Thus, a
monocotyledonous angiosperm is an angiosperm having a
single cotyledon, whereas a dicotyledonous angiosperm is
an angiosperm having two cotyledons. A variety of
angiosperms are known including, for example, oilseed
plants, leguminous plants, fruit-bearing plants,
ornamental flowers, cereal plants and hardwood trees,
which general classes are not necessarily exclusive. The
skilled artisan will recognize that the methods of the
invention can be practiced using these or other
angiosperms, as desired. A gymnosperm is a seed-bearing
plant with seeds not enclosed in an ovary.

In one embodiment, the invention provides a
non-naturally occurring dehiscent seed plant that is
characterized by delayed seed dispersal due to ectopic
expression of a nucleic acid molecule encoding an
AGL8-like gene product in the dehiscent seed plant. As
used herein, the term "dehiscent seed plant" means a seed
plant that produces a dry dehiscent fruit, which has
fruit walls that open to permit escape of the seeds
contained therein. Dehiscent fruits commonly contain
several seeds and include the fruits known, for example,
as legumes, capsules and siliques.

In one embodiment, the invention provides a
non-naturally occurring seed plant that is characterized
by delayed seed dispersal due to ectopic expression of a
nucleic acid molecule encoding an AGL8-like gene product,
where the seed plant is a member of the Brassicaceae.
The Brassicaceae, commonly known as the Brassicas, are a
diverse group of crop plants with great economic value
worldwide (see, for example, Williams and Hill, Science
232:1385-1389 (1986), which is incorporated herein by
reference). The Brassicaceae produce seed oils for
margarine, salad oil, cooking oil, plastic and industrial
uses; condiment mustard; leafy, stored, processed and
pickled vegetables; animal fodders and green manures for
soil rejuvenation. A particularly useful non-naturally
occurring Brassica seed plant of the invention is the
oilseed plant canola.

There are six major Brassica species of
economic importance, each containing a range of plant
forms. Brassica napus includes plants such as the
oilseed rapes and rutabaga. Brassica oleracea are the
cole crops such as cabbage, cauliflower, kale, kohlrabi
and Brussels sprouts. Brassica campestris (Brassica
rapa) includes plants such as Chinese cabbage, turnip and
pak choi. Brassica juncea includes a variety of
mustards; Brassica nigra is the black mustard; and
Brassica carinata is Ethiopian mustard. The skilled
artisan understands that any member of the Brassicaceae
can be modified as disclosed herein to produce a
non-naturally occurring Brassica plant characterized by
delayed seed dispersal.

In a second embodiment, the invention provides
a non-naturally occurring seed plant that is
characterized by delayed seed dispersal due to ectopic
expression of a nucleic acid molecule encoding an
AGL8-like gene product, where the seed plant is a member
of the Fabaceae. The Fabaceae, which are commonly known
as members of the pea family, are seed plants that
produce a characteristic dry dehiscent fruit known as a
legume. The legume is derived from a single carpel and
dehisces along the suture of the carpel margins and along
the median vein. The Fabaceae encompass both grain
legumes and forage legumes. Grain legumes include, for
example, soybean (Glycine), pea, chickpea, moth bean,
broad bean, kidney bean, lima bean, lentil, cowpea, dry
bean and peanut. Forage legumes include alfalfa,
lucerne, birdsfoot trefoil, clover, Stylosanthes species,
Lotononis bainesii and sainfoin. The skilled artisan
will recognize that any member of the Fabaceae can be
modified as disclosed herein to produce a non-naturally
occurring seed plant of the invention characterized by
delayed seed dispersal.

A non-naturally occurring seed plant of the
invention characterized by delayed seed dispersal also
can be a member of the plant genus Cuphea (family
Lythraceae). A Cuphea seed plant is particularly
valuable since Cuphea oilseeds contain industrially and
nutritionally important medium-chain fatty acids,
especially lauric acid, which is currently supplied only
by coconut and palm kernel oils.

A non-naturally occurring seed plant of the
invention also can be, for example, one of the
monocotyledonous grasses, which produce many of the
valuable small-grain cereal crops of the world. In a
non-naturally occurring small grain cereal plant of the
invention, grain remains on the seed plant longer.
Ectopic expression of a nucleic acid molecule encoding an
AGL8-like gene product, or suppression of AGL1 and AGL5
expression as described below, can be useful in
generating a non-naturally occurring small grain cereal
plant, such as a barley, wheat, oat, rye, orchard grass,
guinea grass, sorghum or turf grass plant characterized
by delayed seed dispersal.

The invention also provides a transgenic seed
plant that is characterized by delayed seed dispersal due
to ectopic expression of a nucleic acid molecule encoding
an AGL8-like gene product. In a transgenic seed plant of
the invention, the ectopically expressed nucleic acid
molecule encoding an AGL8-like gene product can be
operatively linked to an exogenous regulatory element.
The invention provides, for example, a transgenic seed
plant characterized by delayed seed dispersal having an
ectopically expressed nucleic acid molecule encoding an
AGL8-like gene product that is operatively linked to an
exogenous constitutive regulatory element. In one
embodiment, the invention provides a transgenic seed
plant that is characterized by delayed seed dispersal
due to ectopic expression of an exogenous nucleic acid
molecule encoding substantially the amino acid sequence
of an AGL8 ortholog operatively linked to an exogenous
cauliflower mosaic virus 35S promoter.

The invention also provides a transgenic seed
plant that is characterized by delayed seed dispersal
due to ectopic expression of a nucleic acid molecule
encoding an AGL8-like gene product operatively linked to
a dehiscence zone-selective regulatory element. The
dehiscence zone-selective regulatory element can be, for
example, an AGL1 regulatory element or AGL5 regulatory
element. The AGL1 regulatory element can be derived from
the Arabidopsis AGL1 genomic sequence disclosed herein as
SEQ ID NO:3 and can be, for example, a 5' regulatory
sequence or intronic regulatory element. Similarly, the
AGL5 regulatory element can be derived from the
Arabidopsis AGL5 genomic sequence disclosed herein as SEQ
ID NO:4 and can be, for example, a 5' regulatory sequence
or intronic regulatory element.

In one embodiment, a transgenic seed plant of
the invention has an ectopically expressed exogenous
nucleic acid molecule encoding substantially the amino
acid sequence of an AGL8 ortholog operatively linked to a
dehiscence zone-selective regulatory element that is an
AGL1 regulatory element having at least fifteen
contiguous nucleotides of nucleotides 1 to 2599 of SEQ ID
NO:3; nucleotides 2833 to 4128 of SEQ ID NO:3;
nucleotides 4211 to 4363 of SEQ ID NO:3; nucleotides 4426
to 4554 of SEQ ID NO:3; nucleotides 4796 to 4878 of SEQ
ID NO:3; nucleotides 4921 to 5028 of SEQ ID NO:3; or
nucleotides 5421 to 5682 of SEQ ID NO:3.

In another embodiment, a transgenic seed plant
of the invention has an ectopically expressed exogenous
nucleic acid molecule encoding substantially the amino
acid sequence of an AGL8 ortholog operatively linked to a
dehiscence zone-selective regulatory element that is an
AGL5 regulatory element having at least fifteen
contiguous nucleotides of nucleotides 1 to 1890 of SEQ ID
NO:4; nucleotides 2536 to 2683 of SEQ ID NO:4;
nucleotides 2928 to 5002 of SEQ ID NO:4; nucleotides 5085
to 5204 of SEQ ID NO:4; nucleotides 5367 to 5453 of SEQ
ID NO:4; nucleotides 5645 to 5734 of SEQ ID NO:4; or
nucleotides 6062 to 6138 of SEQ ID NO:4.
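
The "at least fifteen contiguous nucleotides" criterion recited for
the AGL1 and AGL5 regulatory elements can be checked mechanically.
The sketch below is illustrative only. The AGL1 coordinate list is
copied from the text, but the function name, the toy sequence and the
toy ranges are hypothetical, because SEQ ID NO:3 and SEQ ID NO:4 are
not reproduced here. Only the given strand is compared; reverse
complements are not considered.

```python
# Coordinate ranges (1-based, inclusive) recited for the AGL1 element.
AGL1_RANGES = [(1, 2599), (2833, 4128), (4211, 4363), (4426, 4554),
               (4796, 4878), (4921, 5028), (5421, 5682)]


def shares_contiguous_run(element, genomic, ranges, min_len=15):
    """True if `element` contains at least `min_len` contiguous nucleotides
    lying entirely within one of the `ranges` of `genomic`."""
    element = element.upper()
    genomic = genomic.upper()
    for start, end in ranges:
        region = genomic[start - 1:end]        # 1-based inclusive to 0-based slice
        for i in range(len(region) - min_len + 1):
            if region[i:i + min_len] in element:
                return True
    return False


# Toy stand-in for a genomic sequence: two listed regions flanking an
# unlisted stretch. These strings are invented for illustration.
toy_genomic = ("ATGCATGGCCTTAAGGCTAG"      # positions 1-20 (listed)
               "CCCCCCCCCCGGGGGGGGGG"      # positions 21-40 (not listed)
               "TTTTTAAAAATTTTTAAAAA")     # positions 41-60 (listed)
toy_ranges = [(1, 20), (41, 60)]

inside = "GG" + toy_genomic[2:18] + "CC"   # 16 nt taken from positions 3-18
outside = toy_genomic[22:38]               # drawn only from the unlisted stretch
straddling = toy_genomic[10:25]            # only 10 nt fall within positions 1-20

print(shares_contiguous_run(inside, toy_genomic, toy_ranges))
print(shares_contiguous_run(outside, toy_genomic, toy_ranges))
print(shares_contiguous_run(straddling, toy_genomic, toy_ranges))
```

With the actual sequence in hand, the same function could be called
with AGL1_RANGES, or with the corresponding AGL5 coordinates, in place
of the toy ranges.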

As used herein, the term "transgenic" refers to
a seed plant that contains an exogenous nucleic acid
molecule, which can be derived from the same seed plant
species or a heterologous seed plant species.

The term "exogenous," as used herein in
reference to a nucleic acid molecule and a transgenic
seed plant, means a nucleic acid molecule originating
from outside the seed plant. An exogenous nucleic acid
molecule can be, for example, a nucleic acid molecule
encoding an AGL8-like gene product or an exogenous
regulatory element such as a constitutive regulatory
element or a dehiscence zone-selective regulatory
element, as described further below. An exogenous
nucleic acid molecule can have a naturally occurring or
non-naturally occurring nucleotide sequence and can be a
heterologous nucleic acid molecule derived from a
different seed plant species than the seed plant into
which the nucleic acid molecule is introduced or can be a
nucleic acid molecule derived from the same seed plant
species as the seed plant into which it is introduced.

The term "operatively linked," as used in
reference to a regulatory element and a nucleic acid
molecule, means that the regulatory element confers
regulated expression upon the operatively linked nucleic
acid molecule. Thus, the term "operatively linked," as
used in reference to an exogenous regulatory element such
as a dehiscence zone-selective regulatory element and a
nucleic acid molecule encoding an AGL8-like gene product,
means that the dehiscence zone-selective regulatory
element is linked to the nucleic acid molecule encoding
an AGL8-like gene product such that the expression
pattern of the dehiscence zone-selective regulatory
element is conferred upon the nucleic acid molecule
encoding the AGL8-like gene product. It is recognized
that a regulatory element and a nucleic acid molecule
that are operatively linked have, at a minimum, all
elements essential for transcription, including, for
example, a TATA box.

As used herein, the term "constitutive
regulatory element" means a regulatory element that
confers a level of expression upon an operatively linked
nucleic acid molecule that is relatively independent of the
cell or tissue type in which the constitutive regulatory
element is expressed. A constitutive regulatory element
that is expressed in a seed plant generally is widely
expressed in a large number of cell and tissue types.

A variety of constitutive regulatory elements
useful for ectopic expression in a transgenic seed plant
are well known in the art. The cauliflower mosaic
virus 35S (CaMV 35S) promoter, for example, is a
well-characterized constitutive regulatory element that
produces a high level of expression in all plant tissues
(Odell et al., Nature 313:810-812 (1985)). The CaMV 35S
promoter can be particularly useful due to its activity
in numerous diverse seed plant species (Benfey and Chua,
Science 250:959-966 (1990); Futterer et al., Physiol.
Plant 79:154 (1990); Odell et al., supra, 1985). A
tandem 35S promoter, in which the intrinsic promoter
element has been duplicated, confers higher expression
levels in comparison to the unmodified 35S promoter (Kay
et al., Science 236:1299 (1987)). Other constitutive
regulatory elements useful for ectopically expressing a
nucleic acid molecule encoding an AGL8-like gene product
in a transgenic seed plant of the invention include, for
example, the cauliflower mosaic virus 19S promoter; the
Figwort mosaic virus promoter; and the nopaline synthase
(nos) gene promoter (Singer et al., Plant Mol.
Biol. 14:433 (1990); An, Plant Physiol. 81:86 (1986)).

Additional constitutive regulatory elements
including those for efficient ectopic expression in
monocots also are known in the art, for example, the pEmu
promoter and promoters based on the rice Actin-1
5' region (Last et al., Theor. Appl. Genet. 81:581
(1991); McElroy et al., Mol. Gen. Genet. 231:150 (1991);
McElroy et al., Plant Cell 2:163 (1990)). Chimeric
regulatory elements, which combine elements from
different genes, also can be useful for ectopically
expressing a nucleic acid molecule encoding an AGL8-like
gene product (Comai et al., Plant Mol. Biol. 15:373
(1990)). One skilled in the art understands that a
particular constitutive regulatory element is chosen
based, in part, on the seed plant species in which a
nucleic acid molecule encoding an AGL8-like gene product
is to be ectopically expressed and on the desired level
of expression.

An exogenous regulatory element useful in a
transgenic seed plant of the invention also can be an
inducible regulatory element, which is a regulatory
element that confers conditional expression upon an
operatively linked nucleic acid molecule, where
expression of the operatively linked nucleic acid
molecule is increased in the presence of a particular
inducing agent or stimulus as compared to expression of
the nucleic acid molecule in the absence of the inducing
agent or stimulus. Particularly useful inducible
regulatory elements include copper-inducible regulatory
elements (Mett et al., Proc. Natl. Acad. Sci.
USA 90:4567-4571 (1993); Furst et al., Cell 55:705-717
(1988)); tetracycline and chlor-tetracycline-inducible
regulatory elements (Gatz et al., Plant J. 2:397-404
(1992); Röder et al., Mol. Gen. Genet. 243:32-38 (1994);
Gatz, Meth. Cell Biol. 50:411-424 (1995)); ecdysone
inducible regulatory elements (Christopherson et al.,
Proc. Natl. Acad. Sci. USA 89:6314-6318 (1992);
Kreutzweiser et al., Ecotoxicol. Environ. Safety 28:14-24
(1994)); heat shock inducible regulatory elements
(Takahashi et al., Plant Physiol. 99:383-390 (1992); Yabe
et al., Plant Cell Physiol. 35:1207-1219 (1994); Ueda et
al., Mol. Gen. Genet. 250:533-539 (1996)); and lac operon
elements, which are used in combination with a
constitutively expressed lac repressor to confer, for
example, IPTG-inducible expression (Wilde et al.,
EMBO J. 11:1251-1259 (1992)).

An inducible regulatory element useful in the
transgenic seed plants of the invention also can be, for
example, a nitrate-inducible promoter derived from the
spinach nitrite reductase gene (Back et al., Plant Mol.
Biol. 17:9 (1991)) or a light-inducible promoter, such as
that associated with the small subunit of RuBP
carboxylase or the LHCP gene families (Feinbaum et al.,
Mol. Gen. Genet. 226:449 (1991); Lam and Chua,
Science 248:471 (1990)). Additional inducible regulatory
elements include salicylic acid inducible regulatory
elements (Uknes et al., Plant Cell 5:159-169 (1993); Bi
et al., Plant J. 8:235-245 (1995)); plant
hormone-inducible regulatory elements
(Yamaguchi-Shinozaki et al., Plant Mol. Biol. 15:905
(1990); Kares et al., Plant Mol. Biol. 15:225 (1990));
and human hormone-inducible regulatory elements such as
the human glucocorticoid response element (Schena et al.,
Proc. Natl. Acad. Sci. USA 88:10421 (1991)).

It should be recognized that a non-naturally
occurring seed plant of the invention, which contains an
ectopically expressed nucleic acid molecule encoding an
AGL8-like gene product, also can contain one or more
additional modifications, including naturally and
non-naturally occurring modifications, that can modulate
the delay in seed dispersal. For example, the plant
hormone ethylene promotes fruit dehiscence, and modified
expression or activity of positive or negative regulators
of the ethylene response can be included in a seed plant
of the invention (see, generally, Meakin and Roberts, J.
Exp. Botany 41:1003-1011 (1990); Ecker, Science
268:667-675 (1995); Chao et al., Cell 89:1133-1144
(1997)).

Mutations in positive regulators of the
ethylene response show a reduction or absence of
responsiveness to treatment with exogenous ethylene.
Arabidopsis mutations in positive regulators of the
ethylene response include mutations in etr, which
inactivate a histidine kinase ethylene receptor (Bleeker
et al., Science 241:1086-1089 (1988); Schaller and
Bleeker, Science 270:1809-1811 (1995)); ers (Hua et al.,
Science 269:1712-1714 (1995)); ein2 (Guzman and Ecker,
Plant Cell 2:513 (1990)); ein3 (Rothenberg and Ecker,
Sem. Dev. Biol. Plant Dev. Genet. 4:3-13 (1993); Kieber
and Ecker, Trends Genet. 9:356-362 (1993)); ain1 (van der
Straeten et al., Plant Physiol. 102:401-408 (1993)); eti
(Harpham et al., Ann. Bot. 68:55 (1991)) and ein4, ein5,
ein6, and ein7 (Roman et al., Genetics 139:1393-1409
(1995)). Similar genetic functions are found in other
seed plant species; for example, the never-ripe mutation
corresponds to etr and confers ethylene insensitivity in
tomato (Lanahan et al., The Plant Cell 6:521-530 (1994);
Wilkinson et al., Science 270:1807-1809 (1995)). A seed
plant of the invention can include a modification that
results in altered expression or activity of any such
positive regulator of the ethylene response. A mutation
in a positive regulator, for example, can be included in
a seed plant of the invention and can modify the delay in
seed dispersal in such plants, for example, by further
postponing the delay in seed dispersal.

Mutations in negative regulators of the
ethylene response display ethylene responsiveness in the
absence of exogenous ethylene. Such mutations include
those relating to ethylene overproduction, for example,
the eto1, eto2, and eto3 mutants, and those relating to
constitutive activation of the ethylene signalling
pathway, for example, mutations in CTR1, a negative
regulator with sequence similarity to the Raf family of
protein kinases (Kieber et al., Cell 72:427-441 (1993),
which is incorporated herein by reference). A seed plant
of the invention can include a modification that results
in altered expression or activity of any such negative
regulator of the ethylene response. A mutation resulting
in ethylene responsiveness in the absence of exogenous
ethylene, for example, can be included in a non-naturally
occurring seed plant of the invention and can modify, for
example, diminish, the delay in seed dispersal.

Fruit morphological mutations also can be
included in a seed plant of the invention. Such
mutations include those in carpel identity genes such as
AGAMOUS (Bowman et al., supra, 1989; Yanofsky et al.,
supra, 1990) and in genes required for normal fruit
development such as ETTIN, CRABS CLAW, SPATULA, AGL8 and
TOUSLED (Sessions et al., Development 121:1519-1532
(1995); Alvarez and Smyth, Flowering Newsletter 23:12-17
(1997); and Roe et al., Cell 75:939-950 (1993)). Thus,
it is understood that a seed plant of the invention
having an ectopically expressed nucleic acid molecule
encoding an AGL8-like gene product can include one or
more additional genetic modifications, which can diminish
or enhance the delay in seed dispersal.

The present invention also provides methods of
producing a non-naturally occurring seed plant
characterized by delayed seed dispersal. A method of the
invention entails ectopically expressing a nucleic acid
molecule encoding an AGL8-like gene product in the seed
plant, whereby seed dispersal is delayed due to ectopic
expression of the nucleic acid molecule.

As discussed above, the term "ectopically"
refers to expression of a nucleic acid molecule encoding
an AGL8-like gene product in a cell type other than a
cell type in which the nucleic acid molecule is normally
expressed, at a time other than a time at which the
nucleic acid molecule is normally expressed or at an
expression level other than the level at which the
nucleic acid normally is expressed. In wild type
Arabidopsis, for example, AGL8 expression is normally
restricted during the later stages of floral development
to the carpel valves and is not seen in the outer replum.
In the methods of the invention, particularly useful
ectopic expression of a nucleic acid molecule encoding an
AGL8-like gene product involves expression in the cells
of the outer replum, which are the progenitors of the
dehiscence zone.

Actual ectopic expression of an AGL8-like gene
product is dependent on various factors. The ectopic
expression can be widespread expression throughout most
or all plant tissues or can be expression restricted to a
small number of plant tissues, and can be achieved by a
variety of routine techniques. Mutagenesis, including
seed or pollen mutagenesis, can be used to generate a
non-naturally occurring seed plant in which a nucleic
acid molecule encoding an AGL8-like gene product is
ectopically expressed. Ethylmethane sulfonate (EMS)
mutagenesis, transposon mediated mutagenesis or T-DNA
mediated mutagenesis also can be useful in ectopically
expressing an AGL8-like gene product to produce a seed
plant characterized by delayed seed dispersal (see,
generally, Glick and Thompson, supra, 1993). While not
wishing to be bound by any particular mechanism, ectopic
expression in a mutagenized plant can result from
inactivation of one or more negative regulators of AGL8,
for example, from the combined inactivation of AGL1 and
AGL5.

Ectopic expression of an AGL8-like gene product
also can be achieved by expression of a nucleic acid
encoding an AGL8-like gene product from a heterologous
regulatory element or from a modified variant of its own
promoter. Heterologous regulatory elements include
constitutive regulatory elements, which result in
expression of the AGL8-like gene product in the outer
replum as well as in a variety of other cell types, and
dehiscence zone-selective regulatory elements, which
produce selective expression of an AGL8-like gene product
in a limited number of cell types including the cells of
the valve margin or the dehiscence zone.

Ectopic expression of a nucleic acid molecule
encoding an AGL8-like gene product can be achieved using
an endogenous or exogenous nucleic acid molecule encoding
an AGL8-like gene product. A recombinant exogenous
nucleic acid molecule can contain a heterologous
regulatory element that is operatively linked to a
nucleic acid sequence encoding an AGL8-like gene product.
Methods for producing the desired recombinant nucleic
acid molecule under control of a heterologous regulatory
element and for producing a non-naturally occurring seed
plant of the invention are well known in the art (see,
generally, Sambrook et al., supra, 1989; Glick and
Thompson, supra, 1993).

An exogenous nucleic acid molecule can be
introduced into a seed plant for ectopic expression using
a variety of transformation methodologies including
Agrobacterium-mediated transformation and direct gene
transfer methods such as electroporation and
microprojectile-mediated transformation (see, generally,
Wang et al. (eds.), Transformation of Plants and Soil
Microorganisms, Cambridge, UK: University Press (1995),
which is incorporated herein by reference).

Transformation methods based upon the soil bacterium
Agrobacterium tumefaciens are particularly useful for
introducing an exogenous nucleic acid molecule into a
seed plant. The wild type form of Agrobacterium contains
a Ti (tumor-inducing) plasmid that directs production of
tumorigenic crown gall growth on host plants. Transfer
of the tumor-inducing T-DNA region of the Ti plasmid to a
plant genome requires the Ti plasmid-encoded virulence
genes as well as T-DNA borders, which are a set of direct
DNA repeats that delineate the region to be transferred.
An Agrobacterium-based vector is a modified form of a Ti
plasmid, in which the tumor-inducing functions are
replaced by the nucleic acid sequence of interest to be
introduced into the plant host.

Agrobacterium-mediated transformation generally
employs cointegrate vectors or, preferably, binary vector
systems, in which the components of the Ti plasmid are
divided between a helper vector, which resides
permanently in the Agrobacterium host and carries the
virulence genes, and a shuttle vector, which contains the
gene of interest bounded by T-DNA sequences. A variety
of binary vectors are well known in the art and are
commercially available, for example, from Clontech (Palo
Alto, CA). Methods of coculturing Agrobacterium with
cultured plant cells or wounded tissue such as leaf
tissue, root explants, hypocotyledons, stem pieces or
tubers, for example, also are well known in the art
(Glick and Thompson, supra, 1993). Wounded cells within
the plant tissue that have been infected by Agrobacterium
can develop organs de novo when cultured under the
appropriate conditions; the resulting transgenic shoots
eventually give rise to transgenic plants that
ectopically express a nucleic acid molecule encoding an
AGL8-like gene product. Agrobacterium also can be used
for transformation of whole seed plants as described in
Bechtold et al., C.R. Acad. Sci. Paris, Life Sci.
316:1194-1199 (1993), which is incorporated herein by
reference. Agrobacterium-mediated transformation is
useful for producing a variety of transgenic seed plants
(Wang et al., supra, 1995) including transgenic plants of
the Brassicaceae family, such as rapeseed, Arabidopsis,
mustard, and flax, and transgenic plants of the Fabaceae
family such as soybean, pea, lentil and bean.

Microprojectile-mediated transformation also
can be used to produce a transgenic seed plant that
ectopically expresses an AGL8-like gene product. This
method, first described by Klein et al. (Nature 327:70-73
(1987), which is incorporated herein by reference),
relies on microprojectiles such as gold or tungsten that
are coated with the desired nucleic acid molecule by
precipitation with calcium chloride, spermidine or PEG.
The microprojectile particles are accelerated at high
speed into an angiosperm tissue using a device such as
the BIOLISTIC PD-1000 (Biorad; Hercules, CA).

Microprojectile-mediated delivery or "particle
bombardment" is especially useful to transform seed
plants that are difficult to transform or regenerate
using other methods. Microprojectile-mediated
transformation has been used, for example, to generate a
variety of transgenic plant species, including cotton,
tobacco, corn, hybrid poplar and papaya (see Glick and
Thompson, supra, 1993) as well as cereal crops such as
wheat, oat, barley, sorghum and rice (Duan et al., Nature
Biotech. 14:494-498 (1996); Shimamoto, Curr. Opin.
Biotech. 5:158-162 (1994), each of which is incorporated
herein by reference). In view of the above, the skilled
artisan will recognize that Agrobacterium-mediated or
microprojectile-mediated transformation, as disclosed
herein, or other methods known in the art can be used to
introduce a nucleic acid molecule encoding an AGL8-like
gene product into a seed plant for ectopic expression.

In another embodiment, the invention provides a
non-naturally occurring seed plant that is characterized
by delayed seed dispersal due to suppression of both AGL1
expression and AGL5 expression in the seed plant. Such a
non-naturally occurring seed plant characterized by
delayed seed dispersal can be, for example, an agl1 agl5
double mutant.

As disclosed herein, loss-of-function mutations
in the AGL1 and AGL5 genes were produced by a combination
of homologous recombination and disruptive T-DNA
insertion (see Example II). Neither AGL1 nor AGL5 RNA
was expressed in the resulting agl1 agl5 double mutant,
and scanning electron microscopy revealed that the
dehiscence zone failed to develop normally in these
mutant seed plants. Furthermore, the mature fruits of
these seed plants failed to undergo dehiscence, as shown
in Figure 5. These results indicate that AGL1 or AGL5
gene expression is required for normal development of the
dehiscence zone and that suppression of AGL1 expression
combined with suppression of AGL5 expression in the seed
plant can delay dehiscence, allowing the process of pod
shatter to be controlled.

The Arabidopsis AGL1 and AGL5 genes encode MADS
box proteins with 85% identity at the amino acid level
(see Tables 1 and 2). The AGL1 and AGL5 RNA expression
patterns also are strikingly similar. In particular,
both RNAs are specifically expressed in flowers, where
they accumulate in developing carpels. In particular,
strong expression of these genes is observed in the outer
replum along the valve/replum boundary (Ma et al., supra,
1991; Savidge et al., The Plant Cell 7:721-723 (1995);
Flanagan et al., The Plant Journal 10:343-353 (1996),
each of which is incorporated herein by reference).
Thus, AGL1 and AGL5 are expressed in the valve margin, at
least within the cells of the outer replum.
Table 1

Amino acid identity in the MADS domain and K-domain of
AGAMOUS, AGL1 and AGL5

              AGAMOUS          AGL1            AGL5
            MADS    K       MADS    K       MADS    K
AGAMOUS                     95%     68%     95%     62%
AGL1                                        100%    92%
AGL5

Table 2

Amino acid identity in the I-domain and C-domain of
AGAMOUS, AGL1 and AGL5

              AGAMOUS          AGL1            AGL5
             I      C        I      C        I      C
AGAMOUS
AGL1        71%    39%
AGL5        65%    37%      95%    72%

As used herein, the term "AGL1" refers to
Arabidopsis AGL1 (SEQ ID NO: 6) or an ortholog of
Arabidopsis AGL1 (SEQ ID NO: 6). An AGL1 ortholog is a
MADS box gene product expressed, at least in part, in the
valve margins of a seed plant and having homology to the
amino acid sequence of Arabidopsis AGL1 (SEQ ID NO: 6).
AGL1 or an AGL1 ortholog can function, in part, by
forming a complex with an AGL8-like gene product. An
AGL1 ortholog generally has an amino acid sequence having
at least about 63% amino acid identity with Arabidopsis
AGL1 (SEQ ID NO: 6) and includes polypeptides having
greater than about 70%, 75%, 85% or 95% amino acid
identity with Arabidopsis AGL1 (SEQ ID NO: 6). Given the
close relatedness of the AGL1 and AGL5 gene products, one
skilled in the art will recognize that an AGL1 ortholog
can be distinguished from an AGL5 ortholog by being more
closely related to Arabidopsis AGL1 (SEQ ID NO: 6) than to
Arabidopsis AGL5 (SEQ ID NO: 8). An AGL1 ortholog can
function in wild type plants, like Arabidopsis AGL1, to
limit the domain of AGL8-like gene product expression to
the carpel valves during the later stages of floral
development.

As used herein, the term "AGL5" refers to
Arabidopsis AGL5 (SEQ ID NO: 8) or to an ortholog of
Arabidopsis AGL5 (SEQ ID NO: 8). An AGL5 ortholog is a
MADS box gene product expressed, at least in part, in the
valve margins of a seed plant and having homology to the
amino acid sequence of Arabidopsis AGL5 (SEQ ID NO: 8).
AGL5 or an AGL5 ortholog can function, in part, by
forming a complex with an AGL8-like gene product as shown
in Example IV. An AGL5 ortholog generally has an amino
acid sequence having at least about 60% amino acid
identity with Arabidopsis AGL5 (SEQ ID NO: 8) and includes
polypeptides having greater than about 65%, 70%, 75%, 85%
or 95% amino acid identity with Arabidopsis AGL5 (SEQ ID
NO: 8). Given the close relatedness of the AGL1 and AGL5
gene products, one skilled in the art will recognize that
an AGL5 ortholog can be distinguished from an AGL1
ortholog by being more closely related to Arabidopsis
AGL5 (SEQ ID NO: 8) than to Arabidopsis AGL1 (SEQ ID
NO: 6). An AGL5 ortholog can function in wild type
plants, like Arabidopsis AGL5, to limit the domain of
AGL8-like gene product expression to the carpel valves
during the later stages of floral development.
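
The sequence-based part of the two ortholog definitions above can be
illustrated with a minimal Python sketch. It assumes that the candidate
polypeptide has already been aligned to Arabidopsis AGL1 and AGL5 by some
external tool, and that gaps are written as "-". The function names and
the treatment of the "about 63%" and "about 60%" figures as hard cutoffs
are assumptions made for illustration only; the definitions also require
a MADS box gene product expressed in the valve margin, which this sketch
does not test.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Approximate lower bounds from the text, treated here as hard cutoffs (assumption).
AGL1_MIN_IDENTITY = 63.0
AGL5_MIN_IDENTITY = 60.0


def classify_ortholog(identity_to_agl1: float, identity_to_agl5: float) -> str:
    """Assign a candidate to the reference it is more closely related to.

    Returns "AGL1", "AGL5", or "neither" (below cutoff, or an exact tie).
    """
    if identity_to_agl1 > identity_to_agl5 and identity_to_agl1 >= AGL1_MIN_IDENTITY:
        return "AGL1"
    if identity_to_agl5 > identity_to_agl1 and identity_to_agl5 >= AGL5_MIN_IDENTITY:
        return "AGL5"
    return "neither"
```

For example, a candidate with 88% identity to AGL1 and 80% identity to
AGL5 would be assigned to AGL1 under these assumptions, whereas a
candidate below both cutoffs would be assigned to neither.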

The term "suppressed," as used herein in
reference to AGL1 expression, means that the amount of
functional AGL1 protein is reduced in a seed plant in
comparison with the amount of functional AGL1 protein in
the corresponding wild type seed plant. Similarly, when
used in reference to AGL5 expression, the term suppressed
means that the amount of functional AGL5 protein is
reduced in a seed plant in comparison with the amount of
functional AGL5 protein in the corresponding wild type
seed plant. Thus, the term "suppressed," as used herein,
encompasses the absence of AGL1 or AGL5 protein in a seed
plant, as well as protein expression that is present but
reduced as compared to the level of AGL1 or AGL5 protein
expression in a wild type seed plant. Furthermore, the
term suppressed refers to AGL1 or AGL5 protein expression
that is reduced throughout the entire domain of AGL1 or
AGL5 expression, or to expression that is reduced in some
part of the AGL1 or AGL5 expression domain, provided that
the resulting seed plant is characterized by delayed seed
dispersal.

As used herein, the term "suppressed" also
encompasses an amount of AGL1 or AGL5 protein that is
equivalent to wild type AGL1 or AGL5 expression, but
where the AGL1 or AGL5 protein has a reduced level of
activity. As discussed above, AGL1 and AGL5 each contain
a conserved MADS domain; point mutations or gross
deletions within the MADS domain that reduce the
DNA-binding activity of AGL1 or AGL5 can reduce or
destroy the activity of AGL1 or AGL5 and, therefore,
"suppress" AGL1 or AGL5 expression as defined herein.
One skilled in the art will recognize that, preferably,
AGL1 expression is essentially absent in the valve margin
of a seed plant or the AGL1 protein is essentially
non-functional and, similarly, that, preferably, AGL5
expression is essentially absent in the valve margin of
the seed plant or the AGL5 protein is essentially
non-functional.

A variety of methodologies can be used to
suppress AGL1 or AGL5 expression in a seed plant.
Suppression can be achieved by directly modifying the
AGL1 or AGL5 genomic locus, for example, by modifying an
AGL1 or AGL5 regulatory sequence such that transcription
or translation from the AGL1 or AGL5 locus is reduced, or
by modifying an AGL1 or AGL5 coding sequence such that
non-functional AGL1 or AGL5 protein is produced.
Suppression of AGL1 or AGL5 expression in a seed plant
also can be achieved indirectly, for example, by
modifying the expression or activity of a protein that
regulates AGL1 or AGL5 expression. Methodologies for
effecting suppression of AGL1 or AGL5 expression in a
seed plant include, for example, homologous
recombination, chemical and transposon-mediated
mutagenesis, cosuppression and antisense-based techniques
and dominant negative methodologies.

Homologous recombination of AGL1 or AGL5 can be
used to suppress AGL1 or AGL5 expression in a seed plant
as described in Kempin et al., Nature 389:802-803 (1997),
which is incorporated herein by reference. Homologous
recombination can be used, for example, to replace the
wild type AGL5 genomic sequence with a construct in which
the gene for kanamycin resistance is flanked by at least
about 1 kb of AGL5 sequence. The use of homologous
recombination to suppress AGL5 expression is set forth in
Example II.

Suppression of AGL1 or AGL5 expression also can
be achieved by producing a loss-of-function mutation
using transposon-mediated insertional mutagenesis with Ds
transposons or Stm transposons (see, for example,
Sundaresan et al., Genes Devel. 9:1797-1810 (1995), which
is incorporated herein by reference). Insertion of a
transposon into an AGL1 or AGL5 target gene can be
identified, for example, by restriction mapping, which
can identify the presence of an insertion in the gene
promoter or in the coding region, such that expression of
functional gene product is suppressed. Insertion of a
transposon also can be identified by detecting an absence
of the mRNA encoded by the target gene or by
detecting the absence of the gene product in the valve
margin. Suppression of AGL1 or AGL5 expression also can
be achieved by producing a loss-of-function mutation
using T-DNA-mediated insertional mutagenesis (see Krysan
et al., Proc. Natl. Acad. Sci. USA 93:8145-8150 (1996)).
The use of T-DNA-mediated insertional mutagenesis to
suppress AGL1 expression is disclosed in Example II.

Suppression of AGL1 or AGL5 expression in a
seed plant also can be achieved using cosuppression,
which is a well known methodology that relies on
expression of a nucleic acid molecule in the sense
orientation to produce coordinate silencing of the
introduced nucleic acid molecule and the homologous
endogenous gene (see, for example, Flavell, Proc. Natl.
Acad. Sci. USA 91:3490-3496 (1994); Kooter and Mol,
Current Opin. Biol. 4:166-171 (1993), each of which is
incorporated herein by reference). Cosuppression is
induced most strongly by a large number of transgene
copies or by overexpression of transgene RNA and can be
enhanced by modification of the transgene such that it
fails to be translated.

Antisense nucleic acid molecules encoding AGL1
and AGL5 gene products, or fragments thereof, also can be
used to suppress expression of AGL1 and AGL5 in a seed
plant. Antisense nucleic acid molecules reduce mRNA
translation or increase mRNA degradation, thereby
suppressing gene expression (see, for example, Kooter and
Mol, supra, 1993; Pnueli et al., The Plant Cell Vol. 6,
175-186 (1994), which is incorporated herein by
reference).
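
The relationship between a sense sequence and the corresponding antisense
transcript can be sketched in a few lines of Python. This is an
illustration only: the function names are hypothetical, the input is a
toy sequence rather than any sequence of the invention, and the sketch
assumes an unambiguous A/C/G/T alphabet.

```python
# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_transcript(sense_cdna: str) -> str:
    """RNA expected from an insert cloned in the antisense orientation.

    The transcript is the reverse complement of the sense strand, with U
    in place of T, so that it can base-pair with the endogenous mRNA.
    """
    return reverse_complement(sense_cdna).upper().replace("T", "U")
```

With the toy input "ATGGCC", whose mRNA reads AUGGCC, the antisense
transcript is "GGCCAU", which pairs with the mRNA in antiparallel
orientation.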

To produce a non-naturally occurring seed
plant of the invention, in which AGL1 and AGL5 expression
each are suppressed, the one or more sense or antisense
nucleic acid molecules can be expressed under control of
a strong regulatory element that is expressed, at least
in part, in the valve margin of the seed plant. The
constitutive CaMV 35S promoter (Odell et al.,
supra, 1985), for example, or other constitutive
promoters as disclosed herein, can be useful in the
methods of the invention. Dehiscence zone-selective
regulatory elements also can be useful for expressing one
or more sense or antisense nucleic acid molecules in
order to suppress AGL1 and AGL5 expression in a seed
plant.

The skilled artisan will recognize that
effective suppression of endogenous AGL1 and AGL5 gene
expression depends upon the one or more introduced
nucleic acid molecules having a high percentage of
homology with the corresponding endogenous gene loci.
Nucleic acid molecules encoding Arabidopsis AGL1 (SEQ ID
NO: 5) and AGL5 (SEQ ID NO: 7) are provided herein (see,
also, Ma et al., supra, 1991). Nucleic acid molecules
encoding Arabidopsis AGL1 and AGL5 can be useful in the
methods of the invention or for isolating orthologous
AGL1 and AGL5 sequences.
The homology requirement for effective
suppression using homologous recombination, cosuppression
or antisense methodology can be determined empirically.
In general, a minimum of about 80-90% nucleic acid
sequence identity is preferred for effective suppression
of AGL1 or AGL5 expression. Thus, a nucleic acid
molecule encoding a gene ortholog from the family or
genus of the seed plant species into which the nucleic
acid molecule is to be introduced is preferred for
generating the non-naturally occurring seed plants of the
invention using homologous recombination, cosuppression
or antisense technology. More preferably, a nucleic acid
molecule encoding a gene ortholog from the same seed
plant species is used for suppressing AGL1 expression and
AGL5 expression in a seed plant of the invention. For
example, nucleic acid molecules encoding canola AGL1 and
AGL5 are preferable for suppressing AGL1 and AGL5
expression in a canola plant.

Although use of a highly homologous nucleic
acid molecule is preferred in the methods of the
invention, the nucleic acid molecule to be used for
homologous recombination, cosuppression or antisense
suppression need not contain in its entirety the AGL1 or
AGL5 sequence to be suppressed. Thus, a sense or
antisense nucleic acid molecule encoding only a portion
of Arabidopsis AGL1 (SEQ ID NO: 5), for example, or a
sense or antisense nucleic acid molecule encoding only a
portion of Arabidopsis AGL5 (SEQ ID NO: 7) can be useful
for producing a non-naturally occurring seed plant of the
invention, in which AGL1 and AGL5 expression each are
suppressed.

A portion of a nucleic acid molecule to be
homologously recombined with an AGL1 or AGL5 locus
generally contains at least about 1 kb of sequence
homologous to the targeted gene and preferably contains
at least about 2 kb, more preferably at least about 3 kb
and can contain at least about 5 kb of sequence
homologous to the targeted gene. A portion of a nucleic
acid molecule encoding an AGL1 or AGL5 to be used for
cosuppression or antisense suppression generally contains
at least about 50 base pairs to the full-length of the
nucleic acid molecule encoding the AGL1 or AGL5 ortholog.
In contrast to an active segment, as defined herein, a
portion of a nucleic acid molecule to be used for
homologous recombination, cosuppression or antisense
suppression need not encode a functional part of a gene
product.
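
The approximate length and identity guidelines stated above can be
collected into a simple screening check, sketched below in Python. The
constants restate the "about 1 kb", "about 50 base pairs" and lower
"about 80%" figures from the text; treating them as hard cutoffs, and
the names Fragment and suitable_for, are assumptions made for
illustration. The text itself notes that the homology requirement is
determined empirically.

```python
from dataclasses import dataclass

MIN_HR_HOMOLOGY_BP = 1000    # "at least about 1 kb" for homologous recombination
MIN_SILENCING_BP = 50        # "at least about 50 base pairs" for sense/antisense
MIN_IDENTITY_PERCENT = 80.0  # lower end of "about 80-90%" identity


@dataclass
class Fragment:
    """A candidate portion of an AGL1 or AGL5 nucleic acid molecule."""
    length_bp: int
    identity_percent: float  # identity to the endogenous target locus


def suitable_for(fragment: Fragment, method: str) -> bool:
    """Check a fragment against the approximate guidelines for one method."""
    minimum_length = {
        "homologous_recombination": MIN_HR_HOMOLOGY_BP,
        "cosuppression": MIN_SILENCING_BP,
        "antisense": MIN_SILENCING_BP,
    }
    if method not in minimum_length:
        raise ValueError(f"unknown method: {method}")
    return (
        fragment.length_bp >= minimum_length[method]
        and fragment.identity_percent >= MIN_IDENTITY_PERCENT
    )
```

Under these assumptions, a 300 base pair fragment with 95% identity
would pass for antisense suppression or cosuppression but would be too
short for homologous recombination.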

A dominant negative construct also can be used
to suppress AGL1 or AGL5 expression in a seed plant. A
dominant negative construct useful in the invention
generally contains a portion of the complete AGL1 or AGL5
coding sequence sufficient, for example, for DNA-binding
or for a protein-protein interaction such as a
homodimeric or heterodimeric protein-protein interaction
but lacking the transcriptional activity of the wild type
protein. For example, a carboxy-terminal deletion mutant
of AGAMOUS was used as a dominant negative construct to
suppress expression of the MADS box gene AGAMOUS
(Mizukami et al., Plant Cell 8:831-844 (1996), which is
incorporated by reference herein). One skilled in the
art understands that, similarly, a dominant negative AGL1
or AGL5 construct can be used to suppress AGL1 or AGL5
expression in a seed plant. A useful dominant negative
construct can be a deletion mutant encoding, for example,
the MADS box domain alone ("M"); the MADS box domain and
"intervening" region ("MI"); the MADS box, "intervening"
and "K" domains ("MIK"); or the "intervening," "K" and
carboxy-terminal domains ("IKC").
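
The deletion constructs named above ("M", "MI", "MIK" and "IKC") each
correspond to a contiguous run of the M, I, K and C domains. The
following Python sketch maps a construct name to a residue range. The
domain boundaries in EXAMPLE_DOMAINS are hypothetical placeholders
chosen for illustration; they are not taken from SEQ ID NO: 6 or SEQ ID
NO: 8, and the function name is likewise an assumption.

```python
from typing import Dict, Tuple

# Hypothetical residue boundaries (1-based, inclusive). Placeholders only:
# these values are NOT taken from any sequence disclosed herein.
EXAMPLE_DOMAINS: Dict[str, Tuple[int, int]] = {
    "M": (1, 57),
    "I": (58, 90),
    "K": (91, 160),
    "C": (161, 250),
}

DOMAIN_ORDER = "MIKC"


def construct_region(name: str, domains: Dict[str, Tuple[int, int]]) -> Tuple[int, int]:
    """Return the residue range spanned by a construct such as "MIK".

    The name must be a non-empty, contiguous run of the domain order MIKC.
    """
    if not name or name not in DOMAIN_ORDER:
        raise ValueError(f"not a contiguous run of {DOMAIN_ORDER}: {name!r}")
    start = domains[name[0]][0]   # first residue of the first domain
    end = domains[name[-1]][1]    # last residue of the last domain
    return start, end
```

With the placeholder boundaries, the "MIK" construct spans residues 1 to
160 and the "IKC" construct spans residues 58 to 250; a non-contiguous
name such as "MK" is rejected.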

In a preferred embodiment, a non-naturally
occurring seed plant of the invention is an agl1 agl5
double mutant. An agl1 agl5 double mutant is a
particularly useful non-naturally occurring seed plant
that is characterized by delayed seed dispersal.

As used herein, the term "agl1 agl5 double
mutant" means a seed plant having a loss-of-function
mutation at the AGL1 locus and a loss-of-function
mutation at the AGL5 locus. Loss-of-function mutations
encompass point mutations, including substitutions,
deletions and insertions, as well as gross modifications
of an AGL1 and AGL5 locus and can be located in coding or
non-coding sequences. One skilled in the art understands
that any such loss-of-function mutation at the AGL1 locus
can be combined with any such mutation at the AGL5 locus
to generate an agl1 agl5 double mutant of the invention.
Production of an exemplary agl1 agl5 double mutant in the
Brassica seed plant Arabidopsis is disclosed herein in
Example II.

AGL1 and AGL5 are closely related genes that
have diverged relatively recently. While not wishing to
be bound by the following, some plants can contain only
AGL1 or only AGL5, or can contain a single ancestral gene
related to AGL1 and AGL5. In such plants, a seed plant
characterized by delayed seed dispersal can be produced
by suppressing only expression of AGL1, or expression of
AGL5, or expression of a single ancestral gene related to
AGL1 and AGL5. Thus, the present invention provides a
non-naturally occurring seed plant characterized by
delayed seed dispersal, in which AGL1 expression is
suppressed. Such a non-naturally occurring seed plant
characterized by delayed seed dispersal can be, for
example, an agl1 single mutant. The present invention
also provides a non-naturally occurring seed plant
characterized by delayed seed dispersal, in which AGL5
expression is suppressed. A non-naturally occurring seed
plant characterized by delayed seed dispersal in which
AGL5 expression is suppressed can be, for example, an
agl5 single mutant.

The present invention further provides tissues
derived from non-naturally occurring seed plants of the
invention. In one embodiment, the invention provides a
tissue derived from a non-naturally occurring seed plant
that has an ectopically expressed nucleic acid molecule
encoding an AGL8-like gene product and is characterized
by delayed seed dispersal. In another embodiment, the
invention provides a tissue derived from a non-naturally
occurring seed plant in which AGL1 expression and AGL5
expression each are suppressed, where the seed plant is
characterized by delayed seed dispersal.

As used herein, the term "tissue" means an
aggregate of seed plant cells and intercellular material
organized into a structural and functional unit. A
particularly useful tissue of the invention is a tissue
that can be vegetatively or non-vegetatively propagated
such that the seed plant from which the tissue was
derived is reproduced. A tissue of the invention can be,
for example, a seed, leaf, root or part thereof.

As used herein, the term "seed" means a
structure formed by the maturation of the ovule of a seed
plant following fertilization. Such seeds can be readily
harvested from a non-naturally occurring seed plant of
the invention characterized by delayed seed dispersal.

A seed plant characterized by enhanced seed
dispersal also can be produced by manipulating expression
of an AGL8-like gene product or AGL1 or AGL5.
Suppression of AGL8-like gene product expression in a
seed plant, for example, suppression of AGL8-like gene
product expression in valve tissue, can be used to
produce a seed plant characterized by enhanced seed
dispersal. Ectopic expression of AGL1 or AGL5, or both,
in a seed plant, for example, premature expression of
AGL1 or AGL5, also can be used to produce a non-naturally
occurring seed plant of the invention characterized by
enhanced seed dispersal. The skilled person understands
that these or other strategies of manipulating AGL8, AGL1
or AGL5 expression can be used to produce a non-naturally
occurring seed plant characterized by enhanced seed
dispersal.

The invention also provides a substantially
purified dehiscence zone-selective regulatory element,
which includes a nucleotide sequence that confers
selective expression upon an operatively linked nucleic
acid molecule in the valve margin or dehiscence zone of a
seed plant, provided that the dehiscence zone-selective
regulatory element does not have a nucleotide sequence
consisting of nucleotides 1889 to 2703 of SEQ ID NO: 4.

As used herein, the term "dehiscence
zone-selective regulatory element" refers to a nucleotide
sequence that, when operatively linked to a nucleic acid
molecule, confers selective expression upon the
operatively linked nucleic acid molecule in a limited
number of plant tissues, including the valve margin or
dehiscence zone. As discussed above, the valve margin is
the future site of the dehiscence zone and encompasses
the margins of the outer replum as well as valve cells
adjacent to the outer replum. The dehiscence zone, which
develops in the region of the valve margin, refers to the
group of cells that separate during the process of
dehiscence, allowing valves to come apart from the replum
and the enclosed seeds to be released. Thus, a
dehiscence zone-selective regulatory element, as defined
herein, confers selective expression in the mature
dehiscence zone, or confers selective expression in the
valve margin, which marks the future site of the
dehiscence zone.

A dehiscence zone-selective regulatory element
can confer specific expression exclusively in cells of
the valve margin or dehiscence zone or can confer
selective expression in a limited number of plant cell
types including cells of the valve margin or dehiscence
zone. An AGL5 regulatory element, for example, which
confers selective expression in ovules and placenta as
well as in the dehiscence zone, is a dehiscence
zone-selective regulatory element as defined herein. A
dehiscence zone-selective regulatory element generally is
distinguished from other regulatory elements by
conferring selective expression in the valve margin or
dehiscence zone without conferring expression throughout
the adjacent carpel valves.

The Arabidopsis AGL1 gene (SEQ ID NO: 3) is
shown in Figure 7, with the intron-exon boundaries
indicated. The Arabidopsis AGL5 gene (SEQ ID NO: 4) is
shown in Figure 8, with the intron-exon boundaries
indicated. An AGL1 or AGL5 regulatory element, such as a
5' regulatory element or intronic regulatory element, can
confer selective expression in the valve margin or
dehiscence zone and, thus, is a dehiscence zone-selective
regulatory element as defined herein. The AGL5 gene, for
example, is selectively expressed in the dehiscence zone,
placenta and ovules, and an AGL5 regulatory element can
confer selective expression in the dehiscence zone,
placenta and ovules upon an operatively linked nucleic
acid molecule.

The invention provides a dehiscence
zone-selective regulatory element that is an AGL1 or AGL5
regulatory element. Such a dehiscence zone-selective
regulatory element can be, for example, an AGL1
regulatory element. An AGL1 regulatory element can have,
for example, the nucleotide sequence of a non-coding
portion of the Arabidopsis AGL1 genomic sequence
identified as SEQ ID NO: 3. A dehiscence zone-selective
regulatory element also can be, for example, an AGL5
regulatory element. An AGL5 regulatory element can have,
for example, the nucleotide sequence of a non-coding
portion of the Arabidopsis AGL5 genomic sequence
identified as SEQ ID NO: 4, provided that the regulatory
element does not have a nucleotide sequence consisting of
nucleotides 1889 to 2703 of SEQ ID NO: 4.
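
Nucleotide positions in a sequence listing are 1-based and inclusive,
whereas most programming languages index from zero. The Python sketch
below shows one way to extract a numbered range and to test a candidate
element against the proviso stated above. It reads "consisting of" as
exact equality with the excluded range, which is an interpretive
assumption, and the function names are hypothetical. Any sequence passed
in is supplied by the user; none is reproduced here.

```python
def subsequence(seq: str, start: int, end: int) -> str:
    """Return nucleotides start..end of seq, using 1-based inclusive positions."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("positions out of range")
    return seq[start - 1:end]  # shift to 0-based, end-exclusive slicing


def is_excluded(candidate: str, agl5_genomic: str,
                start: int = 1889, end: int = 2703) -> bool:
    """True if the candidate consists exactly of the excluded range.

    agl5_genomic stands for the sequence identified as SEQ ID NO: 4.
    """
    excluded = subsequence(agl5_genomic, start, end)
    return candidate.upper() == excluded.upper()
```

The range 1889 to 2703 spans 2703 - 1889 + 1 = 815 nucleotides. Under
this reading, a candidate that is longer, shorter or otherwise different
from that exact 815-nucleotide sequence is not excluded by the proviso.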

As used herein, the term "substantially the
nucleotide sequence," when used in reference to an AGL1
or AGL5 regulatory element, means a nucleotide sequence
having an identical sequence, or a nucleotide sequence
having a similar, non-identical sequence that is
considered to be a functionally equivalent sequence by
those skilled in the art. For example, a dehiscence
zone-selective regulatory element that is an AGL1
regulatory element can have a nucleotide
sequence identical to the sequence of the Arabidopsis
AGL1 regulatory element having nucleotides 1 to 2599 of
SEQ ID NO: 3 shown in Figure 1, or a similar,
non-identical sequence that is functionally equivalent.
A dehiscence zone-selective regulatory element can have,
for example, one or more modifications such as nucleotide
additions, deletions or substitutions relative to the
nucleotide sequence shown in Figure 8, provided that the
modified nucleotide sequence retains substantially the
ability to confer selective expression in the valve
margin or dehiscence zone upon an operatively linked
nucleic acid molecule.

It is understood that limited modifications can
be made without destroying the biological function of an
AGL1 or AGL5 regulatory element and that such limited
modifications can result in dehiscence zone-selective
regulatory elements that have substantially equivalent or
enhanced function as compared to a wild type AGL1 or AGL5
regulatory element. These modifications can be
deliberate, as through site-directed mutagenesis, or can
be accidental such as through mutation in hosts harboring
the regulatory element. All such modified nucleotide
sequences are included in the definition of a dehiscence
zone-selective regulatory element as long as the ability
to confer selective expression in the valve margin or
dehiscence zone is substantially retained.

A dehiscence zone-selective regulatory element
can be derived from a gene that is an ortholog of
Arabidopsis AGL1 or AGL5 and is selectively expressed in
the valve margin or dehiscence zone of a seed plant. A
dehiscence zone-selective regulatory element can be
derived, for example, from an AGL1 or AGL5 ortholog of
the Brassicaceae, such as a Brassica napus, Brassica
oleracea, Brassica campestris, Brassica juncea, Brassica
nigra or Brassica carinata AGL1 or AGL5 ortholog. A
dehiscence zone-selective regulatory element can be
derived, for example, from an AGL1 or AGL5 canola
ortholog. A dehiscence zone-selective regulatory element
also can be derived, for example, from a leguminous AGL1
or AGL5 ortholog, such as a soybean, pea, chickpea, moth
bean, broad bean, kidney bean, lima bean, lentil, cowpea,
dry bean, peanut, alfalfa, lucerne, birdsfoot trefoil,
clover, stylosanthes, lotononis bainessii, or sainfoin
AGL1 or AGL5 ortholog.

Dehiscence zone-selective regulatory elements
also can be derived from a variety of other genes that
are selectively expressed in the valve margin or
dehiscence zone of a seed plant. For example, the
rapeseed gene RDPG1 is selectively expressed in the
dehiscence zone (Petersen et al., Plant Mol.
Biol. 31:517-527 (1996), which is incorporated herein by
reference). Thus, the RDPG1 promoter or an active
fragment thereof can be a dehiscence zone-selective
regulatory element as defined herein. Additional genes
such as the rapeseed gene SAC51 also are known to be
selectively expressed in the dehiscence zone; the SAC51
promoter or an active fragment thereof also can be a
dehiscence zone-selective regulatory element of the
invention (Coupe et al., Plant Mol. Biol. 23:1223-1232
(1993), which is incorporated herein by reference).
Further, genes selectively expressed in the dehiscence
zone include the gene that confers selective GUS
expression in the Arabidopsis transposant line GT140
(Sundaresan et al., Genes Devel. 9:1797-1810 (1995),
which is incorporated herein by reference). The skilled
artisan understands that a regulatory element of any such
gene selectively expressed in cells of the valve margin
or dehiscence zone can be a dehiscence zone-selective
regulatory element as defined herein.

Additional dehiscence zone-selective regulatory
elements can be identified and isolated using routine
methodology. Differential screening strategies using,
for example, RNA prepared from the dehiscence zone and
RNA prepared from adjacent pod material can be used to
isolate cDNAs selectively expressed in cells of the
dehiscence zone (Coupe et al., supra, 1993);
subsequently, the corresponding genes are isolated using
the cDNA sequence as a probe.

Enhancer trap or gene trap strategies also can
be used to identify and isolate a dehiscence
zone-selective regulatory element of the invention
(Sundaresan et al., supra, 1995; Koncz et al., Proc.
Natl. Acad. Sci. USA 86:8467-8471 (1989); Kertbundit et
al., Proc. Natl. Acad. Sci. USA 88:5212-5216 (1991);
Topping et al., Development 112:1009-1019 (1991), each of
which is incorporated herein by reference). Enhancer
trap elements include a reporter gene such as GUS with a
weak or minimal promoter, while gene trap elements lack a
promoter sequence, relying on transcription from a
flanking chromosomal gene for reporter gene expression.
Transposable elements included in the constructs mediate
fusions to endogenous loci; constructs selectively
expressed in the valve margin or dehiscence zone are
identified by their pattern of expression. With the
inserted element as a tag, the flanking dehiscence
zone-selective regulatory element is cloned using, for
example, inverse polymerase chain reaction methodology
(see, for example, Aarts et al., Nature 363:715-717
(1993); see, also, Ochman et al., "Amplification of
Flanking Sequences by Inverse PCR," in Innis et al.,
supra, 1990). The Ac/Ds transposition system of
Sundaresan et al., supra, 1995, can be particularly
useful in identifying and isolating a dehiscence
zone-selective regulatory element of the invention.

Dehiscence zone-selective regulatory elements
also can be isolated by inserting a library of random
genomic DNA fragments in front of a promoterless reporter
gene and screening transgenic seed plants transformed
with the library for dehiscence zone-selective reporter
gene expression. The promoterless vector pROA97, which
contains the npt gene and the GUS gene each under the
control of the minimal 35S promoter, can be useful for
such screening. The genomic library can be, for example,
Sau3A fragments of Arabidopsis thaliana genomic DNA or
genomic DNA from, for example, another Brassicaceae of
interest (Ott et al., Mol. Gen. Genet. 223:169-179
(1990); Claes et al., The Plant Journal 1:15-26 (1991),
each of which is incorporated herein by reference).

Dehiscence zone-selective expression of a
regulatory element of the invention can be demonstrated
or confirmed by routine techniques, for example, using a
reporter gene and in situ expression analysis. The GUS
and firefly luciferase reporters are particularly useful
for in situ localization of plant gene expression
(Jefferson et al., EMBO J. 6:3901 (1987); Ow et al.,
Science 234:856 (1986), each of which is incorporated
herein by reference), and promoterless vectors containing
the GUS expression cassette are commercially available,
for example, from Clontech (Palo Alto, CA). To identify
a dehiscence zone-selective regulatory element of
interest such as an AGL1 or AGL5 regulatory element, one
or more nucleotide portions of the AGL1 or AGL5 gene can
be generated using enzymatic or PCR-based methodology
(Glick and Thompson, supra, 1993; Innis et al., supra,
1990); the resulting segments are fused to a reporter
gene such as GUS and analyzed as described above.

The present invention also provides a
substantially purified dehiscence zone-selective
regulatory element that confers selective expression upon
an operatively linked nucleic acid molecule in the valve
margin or dehiscence zone of a seed plant, where the
element is an AGL1 regulatory element having at least
fifteen contiguous nucleotides of one of the following
nucleotide sequences: nucleotides 1 to 2599 of SEQ ID
NO:3; nucleotides 2833 to 4128 of SEQ ID NO:3;
nucleotides 4211 to 4363 of SEQ ID NO:3; nucleotides 4426
to 4554 of SEQ ID NO:3; nucleotides 4655 to 4753 of SEQ
ID NO:3; nucleotides 4796 to 4878 of SEQ ID NO:3;
nucleotides 4921 to 5028 of SEQ ID NO:3; or nucleotides
5361 to 5622 of SEQ ID NO:3. A substantially purified
dehiscence zone-selective regulatory element that is an
AGL1 regulatory element can have, for example, at least
16, 18, 20, 25, 30, 40, 50, 100 or 500 contiguous
nucleotides of one of the portions of SEQ ID NO:3
described above.

The present invention also provides a
substantially purified dehiscence zone-selective
regulatory element that confers selective expression upon
an operatively linked nucleic acid molecule in the valve
margin or dehiscence zone of a seed plant, where the
element is an AGL5 regulatory element having at least
fifteen contiguous nucleotides of one of the following
nucleotide sequences: nucleotides 1 to 1888 of SEQ ID
NO:4; nucleotides 2928 to 5002 of SEQ ID NO:4;
nucleotides 5085 to 5204 of SEQ ID NO:4; nucleotides 5367
to 5453 of SEQ ID NO:4; nucleotides 5496 to 5602 of SEQ
ID NO:4; nucleotides 5645 to 5734 of SEQ ID NO:4; or
nucleotides 6062 to 6138 of SEQ ID NO:4. A substantially
purified dehiscence zone-selective regulatory element
that is an AGL5 regulatory element can have, for example,
at least 16, 18, 20, 25, 30, 40, 50, 100 or 500
contiguous nucleotides of one of the portions of SEQ ID
NO:4 described above.
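
The contiguous-nucleotide criterion recited above can be checked computationally. The following is a minimal sketch in Python and is not part of the disclosed methods: the function names are hypothetical, the coordinate list is the AGL1 list recited above for SEQ ID NO:3, and the reference string is a toy stand-in because the sequence listing is not reproduced in this section. The sketch reports shared sequence only; whether an element confers selective expression must still be determined experimentally.

```python
# Ranges recited in the text for the AGL1 regulatory element
# (1-based, inclusive positions of SEQ ID NO:3).
AGL1_RANGES = [(1, 2599), (2833, 4128), (4211, 4363), (4426, 4554),
               (4655, 4753), (4796, 4878), (4921, 5028), (5361, 5622)]


def longest_shared_run(candidate, reference, ranges):
    """Length of the longest stretch of `candidate` that occurs
    contiguously inside any one of the listed ranges of `reference`."""
    best = 0
    for start, end in ranges:
        region = reference[start - 1:end]  # convert to 0-based slice
        for i in range(len(candidate)):
            # Only try to beat the best run found so far.
            while (i + best + 1 <= len(candidate)
                   and candidate[i:i + best + 1] in region):
                best += 1
    return best


def has_min_contiguous(candidate, reference, ranges, minimum=15):
    """True if the candidate shares at least `minimum` contiguous
    nucleotides with one of the listed ranges."""
    return longest_shared_run(candidate, reference, ranges) >= minimum


# Toy stand-in for SEQ ID NO:3 (ASSUMPTION: not the real sequence).
# Positions 2601-2832, which lie outside the listed ranges, are filled
# with "GT" repeats; everything else is "AC" repeats.
toy_reference = "AC" * 1300 + "GT" * 116 + "AC" * 1584

inside = has_min_contiguous("AC" * 10, toy_reference, AGL1_RANGES)
outside = has_min_contiguous("GT" * 10, toy_reference, AGL1_RANGES)
```

With the real sequence loaded in place of the toy string, the same function could be run with the AGL5 coordinate list for SEQ ID NO:4 and with any of the larger thresholds (16, 18, 20 and so on) named above.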

A proximal fragment of the Arabidopsis AGL5
promoter has been described (Savidge et al., The Plant
Cell 7:721-733 (1995)). However, this fragment (shown as
nucleotides 1889 to 2703 in Figure 8) lacks many of the
distal regulatory elements contained in the entire
Arabidopsis AGL5 genomic sequence disclosed herein (SEQ
ID NO:4). The present invention provides approximately
2.7 kb of Arabidopsis AGL5 5' flanking sequence,
including the variety of regulatory elements contained
therein. The disclosed Arabidopsis AGL5 5' flanking
sequence contains a larger complement of regulatory
elements involved in regulating expression of the
endogenous AGL5 gene in vivo and, therefore, can be
particularly useful for dehiscence zone-selective
expression.

A nucleotide sequence consisting of the
promoter proximal region of Arabidopsis AGL5 (nucleotides
1889 to 2703 of SEQ ID NO:4) is explicitly excluded from
a dehiscence zone-selective regulatory element of the
invention. However, a dehiscence zone-selective
regulatory element can include nucleotides 1889 to 2703
of SEQ ID NO:4, together with one or more contiguous
nucleotides, for example, of the nucleotide sequence
shown as positions 1 to 1888 of SEQ ID NO:4. A
dehiscence zone-selective regulatory element of the
invention can have, for example, at least 15 contiguous
nucleotides of SEQ ID NO:4, including at least one, two,
four, six, ten, twenty or thirty or more contiguous
nucleotides of the nucleotide sequence shown as positions
1 to 1888 of SEQ ID NO:4.
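
The coordinate bookkeeping in the preceding paragraph can be illustrated with a short Python sketch. The function names are hypothetical and the sketch only compares positions in SEQ ID NO:4; it does not and cannot assess the functional requirement that the element confer selective expression in the valve margin or dehiscence zone.

```python
# Coordinates taken from the text (1-based, inclusive, SEQ ID NO:4).
PROXIMAL = (1889, 2703)  # previously described promoter-proximal fragment
DISTAL = (1, 1888)       # distal 5' flanking positions


def overlap(a, b):
    """Number of positions shared by two inclusive intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def describe_element(start, end):
    """Summarize how a candidate element relates to the two regions."""
    element = (start, end)
    return {
        "length": end - start + 1,
        # True only for the fragment that is explicitly excluded.
        "is_excluded_proximal_fragment": element == PROXIMAL,
        "distal_nucleotides": overlap(element, DISTAL),
    }


proximal_only = describe_element(1889, 2703)
with_four_distal = describe_element(1885, 2703)
```

Under these definitions the excluded fragment is 815 nucleotides long and contains no distal positions, while an element beginning at position 1885 carries four contiguous distal nucleotides.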

In view of the definition of a dehiscence
zone-selective regulatory element, it should be
recognized, for example, that a portion of the
Arabidopsis AGL5 gene having only the sequence shown as
nucleotides 1889 to 2703 in Figure 8 (SEQ ID NO:4) is
not a dehiscence zone-selective regulatory element as
defined herein. However, a portion of an Arabidopsis
AGL5 gene having nucleotides 1885 to 2703 of SEQ ID NO:4
is considered a dehiscence zone-selective regulatory
element, provided that the element confers selective
expression upon an operatively linked nucleic acid
molecule in a limited number of plant tissues, including
the valve margin or dehiscence zone. Similarly, a
portion of an Arabidopsis AGL5 gene having a subpart of
the promoter proximal region of AGL5 also can be a
dehiscence zone-selective regulatory element as defined
herein, provided that this subpart can confer selective
expression upon an operatively linked nucleic acid
molecule in a limited number of plant tissues, including
the valve margin or dehiscence zone of a seed plant.
Thus, for example, a regulatory element having the
sequence of nucleotides 1889 to 2000 can be a dehiscence
zone-selective regulatory element of the invention,
provided that this element confers selective expression
upon an operatively linked element in the valve margin or
dehiscence zone of a seed plant.

The present invention also provides a
recombinant nucleic acid molecule that includes a
dehiscence zone-selective regulatory element operatively
linked to a nucleic acid molecule encoding a cytotoxic
gene product. Further provided herein is a non-naturally
occurring seed plant of the invention that is
characterized by delayed seed dispersal due to expression
of a recombinant nucleic acid molecule having a
dehiscence zone-selective regulatory element operatively
linked to a nucleic acid molecule encoding a cytotoxic
gene product.

A cytotoxic gene product is a gene product that
causes the death of the cell in which it is expressed
and, preferably, does not result in the death of cells
other than the cell in which it is expressed. Thus,
expression of a cytotoxic gene product from a dehiscence
zone-selective regulatory element can be used to ablate
the dehiscence zone without disturbing neighboring cells
of the replum or valve. A variety of cytotoxic gene
products useful in seed plants are known in the art
including, for example, diphtheria toxin A chain
polypeptides; RNase T1; Barnase RNase; ricin toxin A
chain polypeptides; and herpes simplex virus thymidine
kinase (tk) gene products. While the diphtheria toxin A
chain, RNase T1 and Barnase RNase are preferred cytotoxic
gene products, the skilled person recognizes that these
or other cytotoxic gene products can be used with a
dehiscence zone-selective regulatory element to generate
a non-naturally occurring seed plant characterized by
delayed seed dispersal.

Diphtheria toxin is the naturally occurring
toxin of Corynebacterium diphtheriae, which catalyzes the
ADP-ribosylation of elongation factor 2, resulting in
inhibition of protein synthesis and consequent cell death
(Collier, Bacteriol. Rev. 39:54-85 (1975)). A single
molecule of the fully active toxin is sufficient to kill
a cell (Yamaizumi et al., Cell 15:245-250 (1978)).
Diphtheria toxin has two subunits: the diphtheria toxin
B chain directs internalization to most eukaryotic cells
through a specific membrane receptor, whereas the A chain
encodes the toxic catalytic domain. The catalytic DT-A
chain does not include a signal peptide and is not
secreted. Further, any DT-A released from dead cells in
the absence of the diphtheria toxin B chain is precluded
from cell attachment. Thus, DT-A is cell autonomous and
directs killing only of the cells in which it is
expressed without apparent damage to neighboring cells.
The DT-A expression cassette of Palmiter et al., which
contains the 193 residues of the A chain engineered with
a synthetic ATG and lacking the native leader sequence,
is particularly useful in the seed plants of the
invention (Palmiter et al., Cell 50:435-443 (1987);
Greenfield et al., Proc. Natl. Acad. Sci. USA
80:6853-6857 (1983), each of which is incorporated herein
by reference).

RNase T1 of Aspergillus oryzae and Barnase
RNase of Bacillus amyloliquefaciens also are cytotoxic
gene products useful in the seed plants of the invention
(Thorsness and Nasrallah, Methods in Cell Biology
50:439-448 (1995)). Barnase RNase may be more generally
toxic to plants than RNase T1 and, thus, is preferred in
the methods of the invention.

Ricin, a ribosome-inactivating protein produced
by castor bean seeds, also is a cytotoxic gene product
useful in a non-naturally occurring seed plant of the
invention. The ricin toxin A chain polypeptide can be
used to direct cell-specific ablation as described, for
example, in Moffat et al., Development 114:681-687
(1992). Plant ribosomes are variably susceptible to the
plant-derived ricin toxin. The skilled person
understands that the toxicity of ricin is variable and
should be assessed in the seed plant species of interest
(see Olsnes and Pihl, Molecular Action of Toxins and
Viruses, pages 51-105, Amsterdam: Elsevier Biomedical
Press (1982)).

Further provided herein is a plant expression
vector including a dehiscence zone-selective regulatory
element. A plant expression vector can include, if
desired, a nucleic acid molecule encoding an AGL8-like
gene product in addition to the dehiscence zone-selective
regulatory element.

The term "plant expression vector," as used
herein, is a self-replicating nucleic acid molecule that
provides a means to transfer an exogenous nucleic acid
molecule into a seed plant host cell and to express the
molecule therein. Plant expression vectors encompass
vectors suitable for Agrobacterium-mediated
transformation, including binary and cointegrating
vectors, as well as vectors for physical transformation.

Plant expression vectors can be used for
transient expression of the exogenous nucleic acid
molecule, or can integrate and stably express the
exogenous sequence. One skilled in the art understands
that a plant expression vector can contain all the
functions needed for transfer and expression of an
exogenous nucleic acid molecule; alternatively, one or
more functions can be supplied in trans as in a binary
vector system for Agrobacterium-mediated transformation.

In addition to a dehiscence zone-selective
regulatory element, a plant expression vector of the
invention can contain, if desired, additional elements.
A binary vector for Agrobacterium-mediated transformation
contains one or both T-DNA border repeats and can also
contain, for example, one or more of the following: a
broad host range replicon, an ori T for efficient
transfer from E. coli to Agrobacterium, a bacterial
selectable marker such as ampicillin resistance and a
polylinker containing multiple cloning sites.

A plant expression vector for physical
transformation can have, if desired, a plant selectable
marker in addition to a dehiscence zone-selective
regulatory element in vectors such as pBR322, pUC, pGEM
and M13, which are commercially available, for example,
from Pharmacia (Piscataway, NJ) or Promega (Madison, WI).
In plant expression vectors for physical transformation
of a seed plant, the T-DNA borders or the ori T region
can optionally be included but provide no advantage.

The present invention also provides a kit for
producing a transgenic seed plant characterized by
delayed seed dispersal. A kit of the invention contains
a dehiscence zone-selective regulatory element. If
desired, the dehiscence zone-selective regulatory element
can be operatively linked to a nucleic acid molecule
encoding an AGL8-like gene product.

The following examples are intended to
illustrate but not limit the present invention.

EXAMPLE I

PRODUCTION OF A 35S-AGL8 TRANSGENIC ARABIDOPSIS PLANT 
DISPLAYING A COMPLETE LACK OF DEHISCENCE 

This example describes methods for producing a
transgenic Arabidopsis plant lacking normal dehiscence
due to constitutive AGL8 expression.

Full-length AGL8 was prepared by polymerase
chain reaction amplification using primer AGL8 5-y (SEQ
ID NO:9; 5'-CCGTCGACGATGGGAAGAGGTAGGGTT-3') and primer
0AM14 (SEQ ID NO:10; 5'-AATCATTACCAAGATATGAA-3'), and
subsequently cloned into the SalI and BamHI sites of
expression vector pBIN-JIT, which was modified from
pBIN19 to include the tandem CaMV 35S promoter, a
polycloning site and the CaMV polyA signal. Arabidopsis
was transformed using the in planta method of
Agrobacterium-mediated transformation essentially as
described in Bechtold et al., C.R. Acad. Sci. Paris
316:1194-1199 (1993), which is incorporated herein by
reference. Kanamycin-resistant lines were analyzed for
the presence of the 35S-AGL8 construct by PCR using a
primer specific for the 35S promoter and a primer
specific for the AGL8 cDNA, which produced two fragments
of 850 and 550 bp in the 35S-AGL8 transgenic plants.
These fragments were absent in plants that had not been
transformed with the 35S-AGL8 construct.
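
As an illustration of how the forward primer is laid out, the Python sketch below locates the SalI recognition sequence (GTCGAC) in SEQ ID NO:9 and the first ATG that follows it. The function name is hypothetical, and reading that ATG as the AGL8 initiation codon is an assumption based on the primer design rather than a statement in the text.

```python
AGL8_FORWARD = "CCGTCGACGATGGGAAGAGGTAGGGTT"  # SEQ ID NO:9, from the text
SALI_SITE = "GTCGAC"                          # SalI recognition sequence


def annotate_primer(primer, site):
    """Return 0-based positions of a restriction site and of the first
    ATG downstream of it (-1 if either is absent)."""
    site_at = primer.find(site)
    if site_at < 0:
        return {"length": len(primer), "site_index": -1,
                "start_codon_index": -1}
    atg_at = primer.find("ATG", site_at + len(site))
    return {"length": len(primer), "site_index": site_at,
            "start_codon_index": atg_at}


annotation = annotate_primer(AGL8_FORWARD, SALI_SITE)
```

For this primer the site begins at index 2 and the following ATG at index 9, so the cloning site sits immediately upstream of the presumed coding sequence.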

The phenotype of approximately 35 35S::AGL8
lines was analyzed. Of the 35 lines, 7 lines exhibited a
complete lack of dehiscence. In these lines, the mature
fruits did not release their seeds unless opened
manually. Several of the remaining 35S::AGL8 lines
exhibited delayed dehiscence, whereby seeds were released
at least a week later than in wild type Arabidopsis
plants.

EXAMPLE II

PRODUCTION OF AN ARABIDOPSIS agl1 agl5 DOUBLE MUTANT
DISPLAYING A COMPLETE LACK OF DEHISCENCE

This example describes the production of an
agl1 agl5 double mutant displaying a complete lack of
normal dehiscence.

A. Production of an agl5 mutant by homologous
recombination

A PCR-based assay of transgenic plants was used
to identify targeted insertions into AGL5 as described in
Kempin et al., Nature 389:802-803 (1997), which is
incorporated herein by reference. The targeting
construct consisted of a kanamycin-resistance cassette
that was inserted between approximately 3 kb
and 2 kb segments representing the 5' and 3' regions of
the AGL5 gene, respectively. A successfully targeted
insertion produces a 1.6 kb deletion within the AGL5 gene
such that the targeted allele encodes only the first 42
of 246 amino acid residues, and only 26 of the 56 amino
acids comprising the DNA-binding MADS-domain. The
recombination event also results in the insertion of the
2.5 kb kanamycin-resistance cassette within the AGL5
coding sequence.

750 kanamycin-resistant transgenic lines were
produced by Agrobacterium-mediated transformation, and
pools of transformants were analyzed using a PCR assay as
described below to determine if any of these primary
transformants had generated the desired targeted
insertion into AGL5. A single line was identified that
appeared to contain the anticipated insertion, and this
line was allowed to self-pollinate to permit further
analyses in subsequent generations. Genomic DNA from the
homozygous mutant plants was analyzed with more than four
different restriction enzymes and by several distinct PCR
amplifications, and all data were consistent with the
desired targeting event. The regions flanking the AGL5
gene also were analyzed to verify that there were no
detectable deletions or rearrangements of sequences
outside of AGL5.

The kanamycin-resistance cassette within the
AGL5 targeting construct contains sequences that specify
transcription termination such that little or no AGL5 RNA
was expected in the homozygous mutant plants. Using a
probe specific for the 3' portion of the AGL5 cDNA, AGL5
transcripts were detected in wild-type but not in agl5
mutant plants. These data indicate that the targeted
disruption of the AGL5 gene represents a loss-of-function
allele.

Characterization of the agl5 line indicated
that the phenotype of this transgenic line was not
different from wild type Arabidopsis.

The AGL5 knockout (KO) construct was prepared
in vector pZM104A, which carries the kanamycin-resistance
cassette flanked by several cloning sites (Miao and Lam,
Plant J. 7:359-365 (1995), which is incorporated herein
by reference). Vector pZM104A also contains the gene
encoding β-glucuronidase (GUS), which allows the
differentiation of non-homologous from homologous
integration events. The 3 kb region representing the 5'
portion of AGL5 was obtained by PCR amplification using
primer SEQ ID NO:11 (5'-CGGATAGCTCGAATATCG-3') and primer
SEQ ID NO:12 (5'-AACCATTGCGTCGTTTGC-3'). The resulting
fragment was cloned into vector pCRII (Invitrogen), and
an EcoRI fragment excised and inserted into the EcoRI
site of pZM104A. The 3' portion of AGL5 was excised as
an XbaI fragment from an AGL5 genomic clone in the vector
pCIT30 (Ma et al., Gene 117:161-167 (1992), which is
incorporated by reference herein) and inserted into the
XbaI site of pZM104A. The resulting plasmid, designated
AGL5 KO, was used in Agrobacterium-mediated infiltration
of wild-type Arabidopsis plants of the Columbia ecotype.
The knockout construct was derived from Landsberg erecta
genomic DNA.

Plants containing a homologous recombination
event at the AGL5 genomic locus were identified as
follows. Approximately 750 primary (T1)
kanamycin-resistant transformants were selected, and DNA
was extracted from individual leaves in pools
representing ten plants as described in Edwards et al.,
Nucleic Acids Research 19:1349 (1991), which is
incorporated by reference herein. To identify a pool
that contained a candidate targeted disruption, isolated
DNAs were subjected to PCR amplification using primer SEQ
ID NO:13 (5'-GTAATTACCAGGCAAGGACTCTCC-3'), which
represents AGL5 genomic sequence that is not contained
within the AGL5 KO construct, and primer SEQ ID NO:14
(5'-GTCATCGGCGGGGGTCATAACGTG-3'), which is specific for
the kanamycin-resistance cassette. Amplified
products were size fractionated on agarose gels, and used
for standard DNA blotting assays with probe 1. One pool
of ten plants revealed the anticipated hybridizing band
of the correct size, and this pool was subsequently
broken down into individual plants. A single
(T1) plant was identified that appeared to contain the
desired event, and this plant was allowed to
self-pollinate for analyses in subsequent generations.
This T1 plant was shown to contain the GUS-reporter gene,
indicating that in addition to the putative
homologous integration event, there were independent
non-homologous events. Segregation in the subsequent
generations allowed the identification of plants that no
longer contained the GUS-reporter gene, and it was these
lines that were used for subsequent analyses.
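
The benefit of the pooled screen described above can be seen with simple arithmetic. The Python sketch below is illustrative only; the function name is hypothetical, and it assumes one PCR assay per pool in the first round followed by one assay per plant in each positive pool.

```python
def pooled_screen_reactions(n_plants, pool_size, positive_pools=1):
    """Count PCR assays for a two-stage pooled screen."""
    n_pools = -(-n_plants // pool_size)       # ceiling division
    second_round = positive_pools * pool_size  # deconvolute positive pools
    return {"pools": n_pools,
            "total_reactions": n_pools + second_round}


# Figures from the text: about 750 T1 plants, pools of ten, one positive pool.
screen = pooled_screen_reactions(750, 10, positive_pools=1)
```

Under these assumptions the screen requires 75 pooled assays plus 10 individual assays, rather than 750 individual assays.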

Plants homozygous for the disruption were
identified by PCR amplification using primers SEQ ID
NO:15 (5'-GAGGATAGAGAACACTACGAATCG-3') and SEQ ID NO:16
(5'-CAGGTCAAGTCAATAGATTC-3'), which yielded a single 1.5
kb product in wild type plants, and a single 2.6 kb
product in the mutant. Further confirmation that these
plants contained the desired disruption was obtained by
PCR amplification with primers SEQ ID NO:17
(5'-CAGAATTTAGTGAATAATATTG-3') and SEQ ID NO:14, which
gave the expected amplified product in the mutant but no
product in wild-type plants.
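
The genotyping logic of the first primer pair can be summarized in a short Python sketch. The product sizes are those stated above; the function name, the size tolerance and the interpretation that a plant showing both products is heterozygous are assumptions added for illustration.

```python
WILD_TYPE_KB = 1.5  # product from SEQ ID NO:15 and NO:16 in wild type
MUTANT_KB = 2.6     # product from the same primers in the mutant


def call_genotype(band_sizes_kb, tolerance_kb=0.2):
    """Classify a plant from the sizes (in kb) of its PCR products."""
    def has_band(expected):
        return any(abs(size - expected) <= tolerance_kb
                   for size in band_sizes_kb)

    wild_type = has_band(WILD_TYPE_KB)
    mutant = has_band(MUTANT_KB)
    if wild_type and mutant:
        return "heterozygous"  # ASSUMPTION: both alleles amplify
    if mutant:
        return "homozygous mutant"
    if wild_type:
        return "wild type"
    return "no call"
```

A lane with only a band near 2.6 kb would be scored as homozygous for the disruption, matching the single-product criterion described in the text.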

To confirm that the desired disruption had
occurred, a series of genomic DNA blots representing
wild-type and homozygous mutant (T4 generation) plants
was analyzed. Probe 1 hybridized to the expected 3.9 kb
XbaI fragment in wild-type and mutant plants, whereas the
1.3 kb XbaI fragment was present only in wild-type. This
same probe hybridized to a 6 kb EcoRI fragment in
wild-type and to the expected 4.1 and 2.8 kb EcoRI
fragments in the mutant. Additional digests
with BglII and with HindIII confirmed that the mutant
plants contained the desired targeted event. To confirm
that there were no detectable deletions or rearrangements
outside the targeted region, genomic DNA blots of wild
type and homozygous mutant plants were further analyzed.
Probe 2 hybridized in wild-type and mutant DNAs to the
expected 2.9 kb XmnI fragment, the 1.5 kb and 0.4 kb
HincII fragments, and the 0.6 kb HindIII fragment. Probe
3 hybridized in wild-type and mutant DNAs to the 9 kb
ScaI fragment, the 3.9 kb XbaI fragment, and the
1.8 kb NdeI fragments. The faintly-hybridizing bands in
the ScaI digests represent fragments that span the
insertion site, and are, as expected, different sizes in
wild-type and agl5 mutant plants.

RNA blotting analyses were performed as
follows. Approximately 6 μg of polyA+ RNA was purified
using Dynabeads (Dynal) from wild-type and agl5 mutant
inflorescences, size fractionated and hybridized using
standard procedures (Crawford et al., Proc. Natl. Acad.
Sci. USA 83:8073-8076 (1986), which is incorporated
herein by reference) using a gel-purified 450 bp
HindIII-EcoRI fragment from pCIT2242 (Ma et al.,
supra, 1991) specific for the 3' end of the AGL5 cDNA.
The same filter was subsequently stripped and
re-hybridized with a tubulin-specific probe (Marks et
al., Plant Mol. Biol. 10:91-104 (1987), which is
incorporated herein by reference). Hybridization with
the tubulin probe verified that approximately equal
amounts of RNA were present in each lane.

B. Production of an agl1 mutant

A PCR-based screen was used to identify a T-DNA
insertion into the AGL1 gene essentially as described in
Krysan et al., supra, 1996.

RNA blotting analyses demonstrated that AGL1
RNA was not expressed. The agl1 mutant displayed
essentially a wild type phenotype.

C. Production and characterization of an agl1 agl5 double
mutant

agl1 agl5 double mutants were generated by
crossing the agl1 and agl5 single mutants. RNA blotting
experiments of the agl1 agl5 double mutant are performed
as described above. The results indicate that neither
AGL1 nor AGL5 RNA is expressed in the agl1 agl5 double
mutant.
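
For planning such a cross, the expected recovery of double homozygotes can be estimated as sketched below in Python. This is a hedged illustration and not part of the described procedure: it assumes that the double mutant is recovered from a selfed F1 plant, that both mutations are recessive and segregate in Mendelian fashion, and that the two loci assort independently. None of these assumptions is stated in the text, and the function names are hypothetical.

```python
import math
from fractions import Fraction


def f2_double_homozygote_fraction(n_loci=2):
    """Expected F2 fraction homozygous mutant at every locus,
    ASSUMING independent assortment (1/4 per locus)."""
    return Fraction(1, 4) ** n_loci


def plants_needed(fraction, confidence=0.95):
    """Smallest number of F2 plants giving at least the requested
    chance of recovering one or more plants of the desired class."""
    miss = 1 - float(fraction)
    return math.ceil(math.log(1 - confidence) / math.log(miss))


expected = f2_double_homozygote_fraction(2)
n_plants = plants_needed(expected, confidence=0.95)
```

Under these assumptions one F2 plant in sixteen is expected to be the double homozygote, so a few dozen F2 plants would normally need to be genotyped.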

In contrast to the agl1 and agl5 single
mutants, which had essentially the phenotype of wild type
Arabidopsis, analyses of the agl1 agl5 double mutant by
scanning electron microscopy indicated that the
dehiscence zone failed to develop normally. Furthermore,
the mature fruits of the agl1 agl5 double mutant failed
to dehisce. This delayed seed dispersal phenotype was
similar to the AGL8 gain-of-function phenotype seen in
35S-AGL8 transgenic plants. These results indicate that
the AGL1 and AGL5 genes are functionally redundant and
that their encoded gene products regulate pod dehiscence.
The similarity of the 35S::AGL8 and agl1 agl5 double
mutant phenotypes, as well as the yeast two-hybrid results
described below, indicates that AGL1 and AGL8 or AGL5 and
AGL8 can interact to regulate the dehiscence process.

D. Analysis of dehiscence phenotypes under various
conditions

Studies of pod dehiscence in Brassica napus L.
using transmission electron microscopic analyses have
shown that the middle lamella of the dehiscence zone
cells degenerates during dehiscence, allowing the valves
to separate from the replum (Petersen et al.,
supra, 1996). Similar analyses are performed on the agl1
agl5 double mutant as well as wild type Arabidopsis and
agl1 and agl5 single mutants.

Previous studies have shown that pod dehiscence
is greater when temperatures are high and the relative
humidity is low. The dehiscence phenotype of the agl1
agl5 double mutant described above was observed for
plants grown under continuous light at 25 degrees C. In
order to determine if the phenotype of agl1 agl5 double
mutants is sensitive to environmental conditions, the
analyses described above are repeated under various
environmental conditions including varying temperature,
varying humidity and short-day versus continuous light
conditions.



EXAMPLE III

PRODUCTION OF A TRANSGENIC ARABIDOPSIS PLANT EXPRESSING
AGL8 UNDER CONTROL OF THE AGL1 PROMOTER

This example demonstrates that a transgenic
seed plant expressing AGL8 under control of a dehiscence
zone-selective promoter is characterized by delayed seed
dispersal.

AGL1::AGL8 transgenic plants

Ectopic expression of AGL8 under control of the
35S promoter prevents pod shatter since the dehiscence
zone fails to differentiate normally. However,
constitutive AGL8 expression conferred by the 35S
promoter also results in other changes, including early
flowering. In order to specifically control dehiscence,
AGL8 is expressed from a dehiscence zone-selective
regulatory element, such as one derived from a regulated
promoter that is normally expressed in valve margin, as
described below.

An AGL8 expression construct under control of
the dehiscence zone-selective 2.5 kb AGL1 promoter
fragment and first AGL1 intronic sequence is prepared as
follows. The 2.5 kb AGL1 promoter fragment is amplified
by PCR with primers AGL1pds (SEQ ID NO:18;
5'-GCCAGAGATAATGCTATTCC-3') and AGL1pus (SEQ ID NO:19;
5'-CATTGATCCATATATGACATCAC-3'), and the first coding exon
of AGL8 is amplified with oligos AGL8eds (SEQ ID NO:20;
5'-GTGATGTCATATATGGATCAATGGGAAGAGGTAGGGTTCAG-3') and
AGL8eus (SEQ ID NO:21; 5'-CAAGAGTCGGTGGAATATTCG-3'). In
addition, the first intron of AGL1, which can contain
regulatory elements, is amplified with oligos AGL1ids
(SEQ ID NO:22;
5'-CGAATATTCCACCGACTCTTGGTACGCTTCTCCTACTCTAT-3') and
AGL1iup (SEQ ID NO:23; 5'-CTAATAAGTAAGATCGCGGAA-3'). The
remainder of the AGL8 coding region is amplified with
oligos AGL8rds (SEQ ID NO:24;
5'-TTCCGCGATCTTACTTATTAGCATGGAGAGGATACTTGAAC-3')
and 0AM14 (SEQ ID NO:10). Using PCR with oligos AGL1pds
(SEQ ID NO:18) and 0AM14 (SEQ ID NO:10), the four
fragments are combined in the following order: AGL1
promoter, first AGL8 exon, first AGL1 intron and
remainder of coding sequence. The resulting 4.6 kb
fragment is cloned into vector pGFM83, which is a vector
based on pBIN19 that is modified to contain a BASTA
resistance gene and 3' NOS termination sequence.
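
The four fragments can be joined by PCR because each internal forward primer carries a 5' tail complementary to the reverse primer of the preceding fragment. The Python sketch below checks this for the primer sequences listed above. It is an illustrative consistency check, not a description of the laboratory procedure; the function names are hypothetical and the pairing of primers into junctions is inferred from the stated fragment order.

```python
PRIMERS = {
    "AGL1pus": "CATTGATCCATATATGACATCAC",                    # SEQ ID NO:19
    "AGL8eds": "GTGATGTCATATATGGATCAATGGGAAGAGGTAGGGTTCAG",  # SEQ ID NO:20
    "AGL8eus": "CAAGAGTCGGTGGAATATTCG",                      # SEQ ID NO:21
    "AGL1ids": "CGAATATTCCACCGACTCTTGGTACGCTTCTCCTACTCTAT",  # SEQ ID NO:22
    "AGL1iup": "CTAATAAGTAAGATCGCGGAA",                      # SEQ ID NO:23
    "AGL8rds": "TTCCGCGATCTTACTTATTAGCATGGAGAGGATACTTGAAC",  # SEQ ID NO:24
}

# (reverse primer of upstream fragment, forward primer of next fragment)
JUNCTIONS = [("AGL1pus", "AGL8eds"),   # promoter / first AGL8 exon
             ("AGL8eus", "AGL1ids"),   # first AGL8 exon / AGL1 intron
             ("AGL1iup", "AGL8rds")]   # AGL1 intron / rest of AGL8

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def junction_overlap(reverse_primer, forward_primer):
    """Length of the 5' tail of the forward primer that matches the
    reverse complement of the upstream reverse primer."""
    upstream = reverse_complement(reverse_primer)
    n = 0
    while (n < min(len(upstream), len(forward_primer))
           and upstream[n] == forward_primer[n]):
        n += 1
    return n


overlaps = [junction_overlap(PRIMERS[rev], PRIMERS[fwd])
            for rev, fwd in JUNCTIONS]
```

Each junction shows a tail spanning the full length of the upstream reverse primer (23, 21 and 21 nucleotides, respectively), which is what allows adjacent fragments to prime on one another when the outer oligos AGL1pds and 0AM14 are used in the final amplification.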

A second AGL8 expression construct, in which
AGL8 is under control of the dehiscence zone-selective
2.5 kb AGL1 promoter fragment alone, is prepared as
follows. The 2.5 kb AGL1 promoter fragment is amplified
by PCR with oligos AGL1pds (SEQ ID NO:18) and AGL1pus (SEQ
ID NO:19), and the coding region of AGL8 is amplified with
oligos AGL8eds (SEQ ID NO:20) and OAM14 (SEQ ID NO:10).
Using PCR with oligos AGL1pds (SEQ ID NO:18) and OAM14
(SEQ ID NO:10), the two fragments are combined, and the
resulting 3.5 kb fragment is cloned into vector pCFM83.


Arabidopsis plants are transformed with the two
AGL1-AGL8 constructs described above. BASTA resistant
plants containing the AGL1::AGL8 transgene with or
without the AGL1 intron are selected. Phenotypic
analysis indicates that transformed plants containing
either of these constructs are characterized by delayed
dehiscence. However, the AGL1::AGL8 transgenic plants
differ from 35S::AGL8 transgenic plants in that an
enlarged fruit or early flowering phenotype generally is
not seen.

These results indicate that a transgenic seed
plant expressing AGL8 under control of an AGL1 dehiscence
zone-selective regulatory element is characterized by
delayed seed dispersal.

EXAMPLE IV 

AGL8 INTERACTS WITH AGL5 IN YEAST

This example demonstrates that, in a yeast 
two-hybrid system, the AGL8 gene product interacts with 
AGL5. 

The "interaction trap" of Finley and Brent
(Gene Probes: A Practical Approach (1994); see, also,
Gyuris et al., Cell 75:791-803 (1993)) is a variation of
the yeast two-hybrid system of Fields and Song, Nature
340:245-246 (1989). In this system, a first protein is
fused to a DNA-binding domain, and a second is fused to a
transcriptional activation domain. An interaction
between the Arabidopsis AGL5 and AGL8 gene products was
assayed by activation of a lacZ reporter gene.

The "bait" and "prey" constructs were prepared
in single copy centromere plasmids pBI-880 and pBI-771,
respectively, which each contain the constitutive ADH1
promoter and are essentially as described by Chevray and
Nathans, Proc. Natl. Acad. Sci. USA 89:5789-5793 (1992).
The bait construct contains the GAL4 DNA-binding domain
(amino acids 1 to 147) fused to the full-length AGL8
coding sequence. The prey construct has the full-length
coding sequence of AGL5 fused to the GAL4 transcriptional
activation domain (amino acids 768-881), following a
nuclear localization sequence. The bait and prey
constructs were assayed in the YPB2 strain of S.
cerevisiae, which is deficient for GAL4 and GAL80 and
which contains an integrated lacZ reporter gene under
control of GAL1 promoter elements (Feilotter et al.,
Nucleic Acids Research 22:1502-1503 (1994)).

An interaction of the AGL8 "bait" and AGL5
"prey" was demonstrated in the YPB2 strain by the
development of blue colonies on X-GAL containing media.
Control "bait"-"prey" combinations, including the
GAL4(1-147) DNA-binding domain and GAL4 transcriptional
activation domain only, produced only white colonies.
These results demonstrate that AGL8 can interact with
AGL5 in yeast and indicate that the AGL8 and AGL5 plant
MADS box gene products also can interact in seed plants.
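The logic of the reporter readout in this example can be summarized in a short sketch. The Python below is a toy model written for illustration, not a description of the assay itself: the function name colony_color and the ASSUMED_INTERACTIONS set are hypothetical, and the model deliberately ignores effects such as self-activation by a bait fusion. It assumes only what the example states, namely that the lacZ reporter is activated (blue colonies on X-GAL) when the protein fused to the DNA-binding domain interacts with the protein fused to the activation domain, and that the two domains alone give white colonies.

```python
# Toy model of a two-hybrid readout; names and data structures are hypothetical.

# Pairs of proteins assumed to interact, taken from the result reported above.
ASSUMED_INTERACTIONS = {frozenset(("AGL8", "AGL5"))}


def colony_color(bait, prey, interacting_pairs):
    """Return "blue" if the lacZ reporter would be activated, else "white".

    bait: name of the protein fused to the DNA-binding domain,
          or None for the DNA-binding domain alone.
    prey: name of the protein fused to the activation domain,
          or None for the activation domain alone.
    interacting_pairs: set of frozensets naming proteins assumed to interact.
    """
    if bait is None or prey is None:
        # Without a fused protein on both sides nothing tethers the
        # activation domain to the reporter promoter in this model.
        return "white"
    if frozenset((bait, prey)) in interacting_pairs:
        return "blue"
    return "white"


if __name__ == "__main__":
    print("AGL8 bait + AGL5 prey:", colony_color("AGL8", "AGL5", ASSUMED_INTERACTIONS))
    print("domains only:", colony_color(None, None, ASSUMED_INTERACTIONS))
```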

All journal article, reference, and patent 
citations provided above, in parentheses or otherwise, 
whether previously stated or not, are incorporated herein
by reference. 
Although the invention has been described with
reference to the examples above, it should be understood
that various modifications can be made without departing
from the spirit of the invention. Accordingly, the
invention is limited only by the following claims.
(1) GENERAL INFORMATION:

(i) APPLICANT: The Regents of the University of California 


(ii) TITLE OF INVENTION: Seed Plants Characterized by Delayed 

Seed Dispersal 

(iii) NUMBER OF SEQUENCES: 24 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Campbell & Flores LLP 

(B) STREET: 4370 La Jolla Village Drive, Suite 700 

(C) CITY: San Diego 

(D) STATE: California 

(E) COUNTRY: United States 

(F) ZIP: 92122 

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 60/051,030 

(B) FILING DATE: 27-JUN-1997 

(A) APPLICATION NUMBER: US 09/067,800 

(B) FILING DATE: 28-APR-1998 

(viii) ATTORNEY/AGENT INFORMATION:

(B) REGISTRATION NUMBER: 31,815

(C) REFERENCE/DOCKET NUMBER: FP-UD 3188

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (619) 535-9001 

(B) TELEFAX: (619) 535-8949 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1062 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 101..827

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1062

(D) OTHER INFORMATION: /note= "There is a poly(A) tail at
the end."

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..1062

(D) OTHER INFORMATION: /note= "Nucleotide and Deduced
Amino Acid Sequences of the AGL8 cDNA clone."

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

CCCAGAGAGA CATAAGAAAG AAAGAGAGAG AGAGATACTT TGGTCATTTC AGGGTTGTCG  60

TTTCTCTCTC TTGTTCTTGA GATTTTGAAG AGAGAGAGAT ATG GGA AGA GGT AGG  115
                                            Met Gly Arg Gly Arg
                                            1               5

GTT CAG CTG AAG AGG ATA GAG AAC AAG ATC AAT AGG CAA GTT ACT TTC  163
Val Gln Leu Lys Arg Ile Glu Asn Lys Ile Asn Arg Gln Val Thr Phe
                10                  15                  20

TCA AAG AGA AGG TCT GGT TTG CTC AAG AAA GCT CAT GAG ATC TCT GTT  211
Ser Lys Arg Arg Ser Gly Leu Leu Lys Lys Ala His Glu Ile Ser Val
            25                  30                  35

CTC TGC GAT GCT GAG GTT GCT CTC ATC GTC TTC TCT TCC AAA GGC AAA  259
Leu Cys Asp Ala Glu Val Ala Leu Ile Val Phe Ser Ser Lys Gly Lys
        40                  45                  50

CTC TTC GAA TAT TCC ACC GAC TCT TGC ATG GAG AGG ATA CTT GAA CGC  307
Leu Phe Glu Tyr Ser Thr Asp Ser Cys Met Glu Arg Ile Leu Glu Arg
    55                  60                  65

TAT GAT CGC TAT TTA TAT TCA GAC AAA CAA CTT GTT GGC CGA GAC GTT  355
Tyr Asp Arg Tyr Leu Tyr Ser Asp Lys Gln Leu Val Gly Arg Asp Val
70                  75                  80                  85

TCA CAA AGT GAA AAT TGG GTT CTA GAA CAT GCT AAG CTC AAG GCA AGA  403
Ser Gln Ser Glu Asn Trp Val Leu Glu His Ala Lys Leu Lys Ala Arg
                90                  95                  100

GTT GAG GTA CTT GAG AAG AAC AAA AGG AAT TTT ATG GGG GAA GAT CTT  451
Val Glu Val Leu Glu Lys Asn Lys Arg Asn Phe Met Gly Glu Asp Leu
            105                 110                 115

GAT TCG TTG AGC TTG AAG GAG CTC CAA AGC TTG GAG CAT CAG CTC GAT  499
Asp Ser Leu Ser Leu Lys Glu Leu Gln Ser Leu Glu His Gln Leu Asp
        120                 125                 130

GCA GCT ATC AAG AGC ATT AGG TCA AGA AAG AAC CAA GCT ATG TTC GAA  547
Ala Ala Ile Lys Ser Ile Arg Ser Arg Lys Asn Gln Ala Met Phe Glu
    135                 140                 145

TCC ATA TCT GCG CTC CAG AAG AAG GAT AAA GCC TTG CAA GAT CAC AAC  595
Ser Ile Ser Ala Leu Gln Lys Lys Asp Lys Ala Leu Gln Asp His Asn
150                 155                 160                 165

AAT TCG CTT CTC AAA AAG ATT AAG GAG AGG GAG AAG AAA ACG GGT CAG  643
Asn Ser Leu Leu Lys Lys Ile Lys Glu Arg Glu Lys Lys Thr Gly Gln
                170                 175                 180

WO 99/00502 PCT/US98/13208

CAA GAA GGA CAA TTA GTC CAA TGC TCC AAC TCT TCT TCA GTT CTT CTG  691
Gln Glu Gly Gln Leu Val Gln Cys Ser Asn Ser Ser Ser Val Leu Leu
            185                 190                 195

CCT CAA TAC TGC GTA ACC TCC TCC AGA GAT GGC TTT GTG GAG AGA GTT  739
Pro Gln Tyr Cys Val Thr Ser Ser Arg Asp Gly Phe Val Glu Arg Val
        200                 205                 210

GGG GGA GAG AAC GGT GGT GCA TCG TCG TTG ACG GAA CCA AAC TCT CTG  787
Gly Gly Glu Asn Gly Gly Ala Ser Ser Leu Thr Glu Pro Asn Ser Leu
    215                 220                 225

CTT CCG GCT TGG ATG TTA CGT CCT ACC ACT ACG AAC GAG T AGAACTATCT  837
Leu Pro Ala Trp Met Leu Arg Pro Thr Thr Thr Asn Glu
230                 235                 240

CACTCTTTAT AATATAATGA TAATATAATT AATGTTTAAT ATTTTCATAA CATTCAGCAT  897

TTTTTTGGTG ACTTATACTC ATTATTAATA CCGATATGTT TTAGCTAGTC ATATTATATG  957

TATGATGGAA CTCCGTTGTC GAGACGTATG TACGTAAGCT ATCATTAGAT TCACTGCGTC  1017

TTAAGAACAA AGATTCATAT CTTGGTAATG ATTTCTCATG AAATA  1062

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 242 amino acids
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Gly Arg Gly Arg Val Gln Leu Lys Arg Ile Glu Asn Lys Ile Asn
1               5                   10                  15

Arg Gln Val Thr Phe Ser Lys Arg Arg Ser Gly Leu Leu Lys Lys Ala
            20                  25                  30

His Glu Ile Ser Val Leu Cys Asp Ala Glu Val Ala Leu Ile Val Phe
        35                  40                  45

Ser Ser Lys Gly Lys Leu Phe Glu Tyr Ser Thr Asp Ser Cys Met Glu
    50                  55                  60

Arg Ile Leu Glu Arg Tyr Asp Arg Tyr Leu Tyr Ser Asp Lys Gln Leu
65                  70                  75                  80

Val Gly Arg Asp Val Ser Gln Ser Glu Asn Trp Val Leu Glu His Ala
                85                  90                  95

Lys Leu Lys Ala Arg Val Glu Val Leu Glu Lys Asn Lys Arg Asn Phe
            100                 105                 110

Met Gly Glu Asp Leu Asp Ser Leu Ser Leu Lys Glu Leu Gln Ser Leu
        115                 120                 125

Glu His Gln Leu Asp Ala Ala Ile Lys Ser Ile Arg Ser Arg Lys Asn
    130                 135                 140

Gln Ala Met Phe Glu Ser Ile Ser Ala Leu Gln Lys Lys Asp Lys Ala
145                 150                 155                 160

Leu Gln Asp His Asn Asn Ser Leu Leu Lys Lys Ile Lys Glu Arg Glu
                165                 170                 175

Lys Lys Thr Gly Gln Gln Glu Gly Gln Leu Val Gln Cys Ser Asn Ser
            180                 185                 190

Ser Ser Val Leu Leu Pro Gln Tyr Cys Val Thr Ser Ser Arg Asp Gly
        195                 200                 205

Phe Val Glu Arg Val Gly Gly Glu Asn Gly Gly Ala Ser Ser Leu Thr
    210                 215                 220

Glu Pro Asn Ser Leu Leu Pro Ala Trp Met Leu Arg Pro Thr Thr Thr
225                 230                 235                 240

Asn Glu



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5622 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 



(ix) FEATURE: 

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..5622

(D) OTHER INFORMATION: /label= AGL1_promoter

/note= "Nucleotide sequence of the AGL1 promoter."



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 








AGATCTGCAA CAGTGAAAAG AGAAAACAAA ATGGACTTGA AGAGGTTTTG ACAATGCCAG  60

AGATAATGCT TATTCCCTAA TATGTTGCCA GCCAAGTGTC AAATTGGCTT TTTAAATATG  120

GATTTCTGTA TCAGTGGTCA TATTTGTGGA TCCAACGTAT TCATCATC/^ GTTCTCAAGT  180

TTGCTTTCAG TGCAATTCTA ATTCACACGT TTAACTTTAA CATGCATGTC ATTATAATTA  240

CTTCTTCACT AAGACACAAT ACGGCAAACC TTTCAGATTA TATTAATCTC CATAAATGAA  300

ATAATTAACC TCATAATCAA GATTCAATGT TTCTAAATAT ATATGGACAA AATTTACACG  360

GAAGATTAGA TACGTATATT AGTAGATTTA GTCTTTCGTT TGTGCGATAA GATTAACCAC  420

CTCATAGATA GTAATATCAT TGTCAAATTC CTCTCGGTTT AGTCGCTAAA TTGTATCTTT  480

TTTAAGCCTA AAAGTAGTGT ATTCGCATAT GACTTATCGT CCTAACTTTT TTTTTAATTA  540

ACAAAAAAAT CGAAAAGAAA ATAATCTGTT 7l?VATATTTTT TAAGTACTCC ATTAAGTTTA  600

GTTTCTATTT AAAAAATGCT TGAAATTTGA CAGTTATGTT CAACAATTTT GAATCATGAG  660

CGATGTCTAG ATACTCAGAA TTTAATCAAG ATGTCTTATC AAATTTGTTG TCACTCGAGG  720

ACCCACGCAA AAGAAAAGAC TAATATGATT TTTATTTGGT CTGGATATTT TTGTAGAGGA  780

TGAAACTAAG AGAGTGAAAG ATTCGAAATC CACAATGTTC AAGAGAGCTC AAAGCAAAAA  840

GAAAAATGAA GATGAAGGAC TAAAGAACAA TAAGCAACTA CTTATACCCT ATTTCCATAA  900

AGGATTCAGG TACTAGGAGA AGTTGAGGCA AGTTNNNNNN NATTGATTCA AATTTTCATT  960

TATTTTTACA ATTTAATTCA CCTAAGTTAT TATGCATTTC TCATCATTGG TACATTTTCT  1020

GTATAGCGTA TTTACATATA TGAAATAAAT TAAATATGTC CTCACGTTGC AAGTAGTTAA  1080

TGAATGTCCC CACGCAAAAA AAAATCCCTC CAAATATGTC CACCTTTTCT TTTGTTTTTA  1140

ATTCCAAAAT TACCATAAAC TTTTGGTTTA CAAAAGATTT CTAGAAATTG AGGAAGATAT  1200

CCTAAATGAT TCATGAATCC TTCAATAATC TGAAGTTTGC GATATTTTCG ATTTTCTTCA  1260

AGAGTTGCGA TATTTGTAAT TTGGTGACCT TAAACTTTTT TTGATAAAGA GTAAACGTTT  1320

TTTCTTAAAA GTAAAACTTG ATTTTATGTT TTAGGGTTCT AGCTCAACTT TGTATTATAT  1380

TTCTTGCAAA AAGAGTTCGT TAACTGCATT CTTCAACACT ATAAAGTGAT TATCAAAAAC  1440

ATCTTCATGA ACATTAAGAA AAACAATATT TGGTTTCGGT TAGAGCTTGG TTTTGCTXGG  1500

CTTGATTCAC ATACCCATTC TAGACTTTGG CATAAATTTG ATACGATAGA GAGTATCTAA  1560

TGGTAATGCA GAAGGGTAAA AAAAGGAAGA GAGAAAAGGT GAGAAAGATT ACCAAAAATA  1620

AGGAGTTTCA AAAGATGGTT CTGATGAGAA ACAGAGCCCA TCCCTCTCCT TTTCCCCTTC  1680

CCATGAAAGA AATCGGATGG TCCTCCTTCA ATGTCCTCCA CCTACTCTTC TCTTCTTTCT  1740

TTTTTTCTTT CTTATTATTA ACCATTTAAT TAATTTCCCC TTCAATTTCA GTTTCTAGTT  1800

CTGTAAAAAG AAAATACACA TCTCACTTAT AGATATCCAT ATCTATTTAT ATGCATGTAT  1860

AGAGAATAAA AAAGTGTGAG TTTCTAGGTA TGTTGAGTAT GTGCTGTTTG GACAATTGTT  1920

AGATGATCTG TCCATTTTTT TCTTTTTTCT TCTGTGTATA AATATATTTG AGCACAAAGA  1980

AAAACTAATA ACCTTCTGTT TTCAGCAACT AGGGTCTTAT AACCTTCAAA GAAATATTCC  2040

TTCAATTGAA AACCCATAAA CCAAAATAGA TATTACAAAA GGAAAGAGAG ATATTTTCAA  2100

GAACAACATA ATTAGAAAAG CAGAAGCAGC AGTTAAGTGG TACTGAGATA AATGATATAG  2160

TTTCTCTTCA AGAACAGTTT CTCATTACCC ACCTTCTCCT TTTTGCTGAT CTATCGTAAT  2220

CTTGAGAACT CAGGTAAGGT TGTGAATATT ATGCACCATT CATTAACCCT AAAAATAAGA  2280

GATTTAAAAT AAATGTTTCT TCTTTCTCTG ATTCTTGTGT AACCAATTCA TGGGTTTGAT  2340

ATGTTTCTTG GTTATTGCTT ATCAACAAAG AGATTTGATC ATTATAAAGT AGATTAATAA  2400

CTCTTAAACA CACAAAGTTT CTTTATTTTT TAGTTACATC CCTAATTCTA GACCAGAACA  2460

TGGATTTGAT CTATTTCTTG GTTATGTATC TTGATCAGGA AAAGGGATTT GATGATCAAG  2520

ATTAGCCTTC TCTCTCTCTC TCTAGATATC TTTCTTGAAT TTAGAAATCT TTATTTAATT  2580

ATTTGGTGAT GTCATATATG GATCAATGGA GGAAGGTGGG AGTAGTCACG ACGCAGAGAG  2640

TAGCAAGAAA CTAGGGAGAG GGAAAATAGA GATAAAGAGG ATAGAGAACA CAACAAATCG  2700
TCAAGTTACT TTCTGCAAAC GACGCAATGG TCTTCTCAAG AAAGCTTATG AACTCTCTGT  2760

CTTGTGTGAT GCCGAAGTTG CCCTCGTCAT CTTCTCCACT CGTGGCCGTC TCTATGAGTA  2820

CGCCAACAAC AGGTACGCTT CTCCTACTCT ATTTCTTGAT CTTGTTTTCT TAATTTTAAC  2880

TAAACAAGAT CCTAGTTCAA ATGATAACAA AGTGGGGATT GAGAGCCAAG ATTAGGGTTT  2940

GGTTAATTTA GAAAACCAGA TTTCACTTGT TGATACATTT AATATCTCTC TAGCTAGATT  3000

TAGTACTCTC TCCTCTATAT ATGTGTGGGT GTGTGTGTAA GTGTGTATAT GTATGCAAAT  3060

GCAAGAAGAA GAAGAAAAAG TTATCTTGTC TTCTCAAATT CTGATCAGCT TTGACCTTAG  3120

TTTCACTCTT TTTTCTGCAA ATCATTTGAA CCTGATGCAT GTCAGTTTCT ACAATACACT  3180

TTTAATTTTG ACGGCCCATC AAATTTCCTA GGGTTTACTT CAGTGAACAA AATTGGGTTC  3240

TTGACACGAT TTAGCATGTA TATATAAAAA TAGGGGATGA TCAAGACTTA TGTAACCTCT  3300

GTCTGGTGAA ACTAGGGACA AAGTCTACTG ATGAGTTGTC ACTAGGGATC CATTTGATCA  3360

TTTAATCCCA ACAAAAATGA AACAAAATTT TGAGAATTTA TATGCTGAAG TTTTTCAACC  3420

CTCTTTTTTA AATAACTTTA TATTATGTAG ATTTGTATTT AGGGTAATTT GTCCAACTAG  3480

AAGTCCTAAA AATCAATAAA CACACGGATG ACTTTGTCTA ACATTGTATC AGTCATCAAA  3540

TGTAAAATTG TACAAATAAT GAAATTAAAG ATTTAGTCTC TTTTATTTTT TTTGTTTAGG  3600

GTGTATATAT ATATATATAT GTATATTTGT TGCATTGATA TATCAATGAG AGGGAGAGAA  3660

CTCAGAGAAG TGTCGGAAAT TAAAATGGTA CGAGCCAATT GGAATCTCTG GCATTCTGAG  3720

CTTCATTTGT TTGTTATTAG AAAAAAAAAA AAAAAATCCT TTAAAGATAC CTTCATGATG  3780

ACATTGAATC ATGTAATATA CACGATACAT GGTCTAATTC CTCCTCAAAC CCTAATTACC  3840

AATTTCGAAA CCATAATATT TACTAGTATG TTTATATATC CTTACTTTAA GACATTGTTT  3900

GTTTATAATA CCTTGTGAAT TAAGAAAAAA AAAAAAAAAC TTGTGGATCT ATTCAAGCCA  3960

TGTGTTAGAA TAAATTTATA AATTTTCTCC TCGTACTGGT CAGATATTGG TCCAAACTCC  4020

AAAGCCTTCC CTTTTCAGGA AAAAAAACAT TTCGAAATTA ACTCTAATTA ATCAAGAATT  4080

TCCTACAATG TATACATCTA ATGTTTTTTC CGCGATCTTA CTTATTAGTG TGAGGGGTAC  4140

AATTGAAAGG TACAAGAAAG CTTGTTCCGA TGCCGTCAAC CCTCCTTCCG TCACCGAAGC  4200

TAATACTCAG GTACCAATTT ATATTGTTTG ATTCTCTTTG TTTTATCTTC TTCTTTTCAT  4260

TATATATATG ATCAACAAAA AATATAACCT

GAAACGGTTT CGTTATGGTG TTTGAATACA TGGATTTTTG AAGTACTATC AGCAAGAAGC  4380

CTCTAAGCTT CGGAGGCAGA TTCGAGATAT TCAGAATTCA AATAGGTAAT TCATTAACTT  4440

TTCATGAACT CTTCGATTTG GTATTAGGTC ACTTAATTTG GTGTCGGTCC AAAAGTCCGC  4500

TTGTAGTTTT CTTTAGTVAGT TGTTTTGTTT AATGTTCATG TTTACAAATT GAAGGCATAT  4560

TGTTGGGGAA TCACTTGGTT CCTTGAACTT CAAGGAACTC AAAAACCTAG AAGGACGTCT  4620
TGAAAAAGGA ATCAGCCGTG TCCGCTCCAA AAAGGTAAAA TCTACGTTGC TCTCTCTCTG  4680

TGTCTCTGTC TCTCTCTCTA TATATAGTCC CTTAGTTTAT ATAGTTCATC ACCCTTTTGT  4740

GAGAATTTTG CAGAATGAGC TGTTAGTGGC AGAGATAGAG TATATGCAGA AGAGGGTAAG  4800

AACGTTTCTC CCATTCCAAG TAATTAGATC TTTCTTCGTC TTTGTGAGGG TTTGAGTTTT  4860

CCCATAAATC ATGTGTAGGA AATGGAGTTG CAACACAATA ACATGTACCT GCGAGCAAAG  4920

GTTAGCCACG TTCTGTTCCA AATCTTAATC TCAATATCTA CTCTTTTCTT CATTGTATAA  4980

CTAAGATAAC GTGAATAACA AGAAAACTTT TGTTTTTGGG TTTAATAGAT AGCCGAAGGC  5040

GCCAGATTGA ATCCGGACCA GCAGGAATCG AGTGTGATAC AAGGGACGAC AGTTTACGAA  5100

TCCGGTGTAT CTTCTCATGA CCAGTCGCAG CATTATAATC GGAACTATAT TCCGGTGAAC  5160

CTTCTTGAAC CGAATCAGCA ATTCTCCGGC CAAGACCAAC CTCCTCTTCA ACTTGTGTAA  5220

CTCAAAACAT GATAACTTGT TTCTTCCCCT CATAACGATT AAGAGAGAGA CGAGAGAGTT  5280

CATTTTATAT TTATAACGCG ACTGTGTATT CATAGTTTAG GTTCTAATAA TGATAATAAC  5340

AAAACTGTTG TTTCTTTGCT TAATTAGATC AACATTTAAA TCCAAAGTTC TAAAACACGT  5400

CGAGATCCT^ AGTTTGTCAT ACAAGATTAG ACGCATACAC GATCAGTTAA TAGATTTTAA  5460

GTGCCTTTTA ATATTTACAT ATAGTTGCAG CTTCGATTAG ATCATGTCCA CCAAACACTC  5520

ACAATTAGAG ACAAGCAAAA CTATAAACAT TGATCATAAA ATGATTACAA CATGTCCATA  5580

AATTAATTAT GGATTACAAA AATAAAAACT TACAAAAGAT CT  5622

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6138 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..6138

(D) OTHER INFORMATION: /label= AGL5_promoter

/note= "Nucleotide sequence of the AGL5 promoter."

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

GAATTCGTAA CAGAATTTAG TGAATAATAT TGTAATTACC AGGCAAGGAC TCTCCAAACG 60 

GATAGCTCGA ATATCGTTAT TAAAGAGTAA ATGATCCAAT ATGTAAGCCA TTGTTGATCA 120

TCTAACATTG TTGGACTCTC TATTGCTCGA AATGATGCAT ACCTAATCAT TTATTCAGTT 180

AACTATCAAG TTGCATTTGT AAAAACCAAA CATTTAAATT CAGATTTGAT ATCACTTACA 240 

GAGGATAGAG AAGCATGACT CCAGGCCTGC ATGCAACAAG AAAAAGGAAG AAAATAATGT 300 

TAAAAATTTG ACAAATATAG TGTTTATTTT TATTATATGA GACAGAATTT GAATAAAATC 360 
CTACCCAACT AGAGCATCAA AACGTTTTGC AATCGCAATA ATGAAACCCA TTTTCTTTTT 420

GAGTTTTTAC TCTTCTTTCA ACAGAAACTT TCTCAAACGT CTTTAGCACT GTGACGTTAG 480

ATATATACAC AAAAGCTTGA AATTTCTTCA AGCAAAAGAA TCTTTGTGGG AGTTAAGGCA 540

ACAAGCCAGG TAAAGAATCT CCAACGCATT GTTACGTTTT CATGAACCTA TTTATTATAT 600

GTTCTAAGAA AGAAAAAAAT ATCTCAAAGT AAACGTTGGA AATTTTCTGA TGAAGGGAAA 660

TCCAAAGTCT TGGGTTTAGT ATCCCTATGA ATGGTATTTG GAATATGTTT TCGTCAAAAC 720

AAAAGATTCT TTTCTTTTTC ACAAGAGTTA GTGATCAATA ACTTATGCAC TAATTAATGA 780


GATTGGACGT 


ATACACAATT 


1 

T GAT TATGAT 


ACTTGAGTAA 


AAATCACCTG 


TPCTTTAATT 

X w w X X_X/V^X X 


840 


TGGAAATCTC 


1 

TCTTTCTTAC 


CCATTTATAT ■ 

^^^^«X X X Xs^X«AA 


AC T AC T TCT T 

A^S^ X S»\^ X X w X X 


TTPATTAAAA 


TTAAATTTPA 

X X / XXX \^f\ 




ATTATPAATr 


ATCGTTCAAT 


TTGATAAAGA 


TTTAAPATTT 
X X xc\tW^r\x X X 


TTTGTPAPan 
X X 1 u X ^nv,rnVj 


Gf^PTAGT AAA 




AGPAATPTTT 
ns3\^nt\x \mi X X X 


APATAATTPA 


TPTTTPTTAP 

X\-<X X Xv«X X ^^\.« 


AT AT AT AT AT 
r\ Xnxnxrvxrix 


TAPPTTTTTP 


TTPATTAf^TZi 
X X Wri X X rVvj 1 A 

* 


JL U 


TTPTATTTf^ A 
X X\^ xt\x X X 


TTATf^ATT AT 
X X t\ X X X X 


TTTGTP AT AA 

X X XOXXfAXXUa 


AGPTAGTAAA 


TTAA APAPTP 


/^ATATnAnazi 

V3/i i n i onva/iri 


lUDU 


TT AT ATT APT 

X X t\ Xt\X X t\\^ X 


TrAPGPTAAT 

X ^^nV^O^^ X An x 


TAAPTPTTAA 
X fVi^ X w X xnr\ 


PAPAAPAAGA 


APTAGTGPAT 


Ml i l^Mn^ ill 


X ± H U 


PAAAGPATAT 


APT AT AT ATT 
X r\ X r\ x x x 


GAGAATATAG 


APPAPGAAAG 


TPAATPAAAA 


GAPPTAPPAG 


1 9nn 


PTrTPATPAA 


GTTPTTTPTT 

wX XVa»X X XV^X X 


GAAATGATTT 


TGPAGAATTT 
X vswnunA x x x 


PP A APTT A AT 


TAATTPGAPA 


1 ^ DU 


TGAATGTGAA 


AATGTGTGTT 

CU^X \3 X \3 X V3 X X 


GCTCGTTAAG 


AAAATTGAAT 


AGAAGTAPAA 


TGAAAATGAT 
X vs/wnrv x u a x 




GAGGAATGGG 


CAAAACACAA 


AAGAGT TT CC 


TTTPGTAAPT 


APAATTAATT ■ 
c\\^r\t\ X X r\f\ X X 


AATGPAAATP 
nn X wV^^v^n x \^ 


X w 


TGAGAAAGGG 


TTCATGGATA 


ATGACTACAC 


AC AT GAT TAG 


TPATTPPPPG 


TGGGPTPTPT 


14 40 


GCTTTCATTT 


AC T T TAT TAG 


TTTCATCTTC 

XXX X X X 


TCTAATTATA 

X ^ X X X X rx 


TTGTPGPATA 

X X V7 J. \^ VjN^n X n 


TATGATGPAG 


1500 


TTCTTTTGTC TAAATTACGT AATATGATGT AATTAATTAT CAAAATAAAT ATTCAAATTG 1560

CCGTTGGACT AACCTAATGT CCAAGATTAA GACTTGAACA TAAGAATTTT GGAAAAACTA 1620

AACCAGTTAT AATATATACT CTTTW^TTGC CATTTCTGAA CACAACCAAA TAATAATATA 1680

TACTATTTAC AGTTTTTTTT AATTGGCAAG AACACTGAAA TCTTATTCAT TGTCTCGCTT 1740

GGTAGTTGAC AAGTTATAAC ACTCATATTC ATATAACCCC ATTCTAACGT TGACGACGAA 1800

CACTCATATA AACCACCCAA ATTCTTAGCA TATTAGCTAA ATATTGGTTT AATTGGAAAT 1860

ATTTTTTTTA TATATAAAAT GCCAGGTAAA TATTAACGAC ATGCAATGTA TATAGGAGTA 1920

GGGCAATAAA AAGAAAAGGA GAATAAAAAG GGATTACCAA AAAAGGAAAG TTTCCAAAAG 1980

GTGATTCTGA TGAGAAACAG AGCCCATACC TCTCTTTTTT CCTCTAAACA TGAAAGAAAA 2040

ATTGGATGGT CCTCCTTCAA TGCTCTCTCC CCACCCAATC CAAACCCAAC TGTCTTCTTT 2100

CTTTCTTTTT TCTTCTTTCT AATTTGATAT TTTCTACCAC TTAATTCCAA TCAATTTCAA 2160

ATTTCAATCT AAATGTATGC ATATAGAATT TAATTAAAAG AATTAGGTGT GTGATATTTG 2220

AGAAAATGTT AGAAGTAATG GTCCATGTTC TTTCTTTCTT TTTCCTTCTA TAACACTTCA 2280
GTTTGAAAAA AAACTACCAA ACCTTCTGTT TTCTGCAAAT GGGTTTTTAA ATACTTCCAA 2340

AGAAATATTC CTCTAAAAGA AATTATAAAC CAAAACAGAA ACCAAAAACA AAAAATAAAG 2400 

TTGAAGCAGC AGTTT^AGTGG TACTGAGATA ATAAGAATAG TATCTTTAGG CCAATGAACA 2460

AATTAACTCT CTCATAATTC ATCTTCCCAT CCTCACTTCT CTTTCTTTCT GATATAATTA 2520 

ATCTTGCTAA GCCAGGTATG GTTATTGATG ATTTACACTT TTTTTTAAAA GTTTCTTCCT 2580

TTTCTCCAAT CAAATTCTTC AGTTAATCCT TATAAACCAT TTCTTTAATC CAAGGTGTTT 2640 

GAGTGCAAAA GGATTTGATC TATTTCTCTT GTGTTTATAC TTCAGCTAGG GCTTATAGAA 2700

ATGGAGGGTG GTGCGAGTAA TGAAGTAGCA GAGAGCAGCA AGAAGATAGG GAGAGGGAAG 2760 

ATAGAGATAA AGAGGATAGA GAACACTACG AATCGTCAAG TCACTTTCTG CAAACGACGC 2820 

AATGGTTTAC TCAAGAAAGC TTATGAGCTC TCTGTCTTGT GTGACGCTGA GGTTGCTCTT 2880 

GTCATCTTCT CCACTCGAGG CCGTCTCTAC GAGTACGCCA ACAACAGGTA CACATCTTTT 2940 

AGCTAGATCT TGATTTTGTT GAATTTTTTT TCTAGAATAA AGTTTCGACT CTTCTGGTGG 3000 

GTTTTTCAAT CTTTATGGTC TCTTTATAGT TTTTTTCCTT AGTTTCTCTG AAGCTCAAAT 3060 

CTCTTTAAAA ATCCCCAAAA TTAGGGTTTG TTTAAAACTA GGGAACCCTA CTTTAACTTC 3120 

TTTCTCTTAG TAAAAAAGCA GTGAGGGTCT TCTCTGATCA TTAATTAGCA TCCCCCATAC 3180 

CTTGTTCCAG TCACTTTTTC TCCACAAATC CTTATAACAG TATCTATATA TGTATCTATT 3240 

TATGTCAGTT TGTACAAGAC ACTTCGATCA ATTTGATGAC CCATCAAGTT TTATTTCTGC 3300 

AGATTGATCA TTAGGTTTCC ATCATAGTAA TGAAAAAGTA GGGTTCTTGA TAAAATTATA 3360 

ATAATATATA TTATTTGGCT ATATAAAAAA GCTATGTAGA TTCCTTAAAA ATTGATTCAC 3420 

TAGGGAGAGA CTAGTAGGTG TTTGTCTTCT GACACTTCTC TAATCTTTTG GTGAATCCTT 3480 

TTGTTAAATC AAGAAAATGA ATCAGGGACA AAGCTTATTG TTGAGTCACT TAATTAATCA 3540 

TCCGATCCAT CAATCAAGAA AAATAACGAA ACAGAAAATT TTGATTTTTG ATTGTTATTT 3600 

TCTCCACTTC AAGTTGGGGA CTTGTCATTT CCGTTTTTCT ATACGTTTCC AGCTATTAAC 3660 

AGCTCATGTT CATTTCACCA TTTTGATTAT TTGTCTGCTT TTTAAAGATA AATGTTTTCA 3720 

AAAATATTGT TTTTATTTGC TTGGCTAGTT AATACTATAA TTGAGGTTGA TGTATGACTA 3780 

TAATCTATAA GTCAAGTCTC ATATCATGGA TCTAAGTTAA AACTAGTAAA TTTGTAGTTT 3840

CAATGTGAAC TTTCACAACG ACTAAAGAAC TGATCTGAAG TTTATAATGG ACATGACTAA 3900 

TTTGATTAAC AAAAGAGGAA TGCATTATGT ATGTAGAAAC ATGTGATATA TATATGTTTC 3960 

TATTATCAAA AGTGTAGTTA ACTTTCTTAT TTCAAACACC CTCATGCTTT AGTAGTATCT 4020 

TACTTTTGAC ATTTCTCAAC TTCAGCTTTC CATTATACAA CAGCACAATG TAAATTACTT 4080 

GTATATGAAT ATGAAAGCAT AACGTTATGC AAAGATTTCT AGCTTTTCTT TTTCTGTTTT 4140 

GCAAAAGATT TACAAATATC ATGTTCTTGG TAAAAACATA CTTGCCTCAG CCACATATGC 4200 
ATGTAAATGT AATGTTCAAA TATTAATTCA GGAAAAACAA AGAAGAAGCA AAATTAGCTT 4260

CTAGAGTAGG GAATCTATTG ACTTGACCTG AAAATCACTT CTTTTTCTTA AAGCCTAGTA 4320

GTGAATTTTT TAATCTAATT AGGCCAAAAT ATATACTAGC CTAAAATATA ATTTGGATTT 4380

TGTGTCGTAC ATAAATTGGG ACCAATTCCA ATTAACTAAG AGCATATGCA ATTCAAATTC 4440

TTTTTATTTT CTTCTCCGAT TTGCTACTTC TTTCTTTTGT ATGTTTTCAA ATTAGGATTA 4500

CACTTTTTTG GGGAAGTACA CATTAGGGTC TTCTCGAACT TTGATTATAC ATATATATAT 4560

ATATATATAT ATATAACTTT GTGAGATGTC ACTGTTAATA GATAATAGGC AATAACAATA 4620

ATATCCAAAA AAGAAGGCGC AAACAAATCA TATACTATAT GGTACTGGTC CATTCACTAT 4680

TTTGTCGGTT GAATTTAAGG TTTGGCGTAC AAACTTTGTT TCAAACCTTT ATTATTCCGT 4740

CTTTCTGTGT GTTTTGTATA TCCAGAAGAT AAAAATATCA ATTTCTTTAA CGACTTCATA 4800

TATATATATA TATATATATA TATATATATT TTTCTCTTCT GGTTTTAGTG TTTGAATCCA 4860

ACAGTTATAG TTTCGTGTGT CTTTGTTTTA CTTGTGGTGG TTTAAGTTTG AGATTTTCAC 4920

CGATTGCATC TATTTACATA TATAGCTACC ACAAAAAAGA TTGCATTTTA AAATCTTTTC 4980

CTTTGTGTGA ATGTTGATGA AGTGTGAGAG GAACAATAGA AAGGTACAAG AAAGCTTGCT 5040

CCGACGCCGT TAACCCTCCG ACCATCACCG AAGCTAATAC TCAGGTTAGC TTTTAATTAA 5100

TACACCTAGC TAGCTAGTTC GTTAATTACT TAATTTCTTC TTCTTTTAGT TATCTGACCT 5160

TTTTTTCACC TCTTGTAACA ATGATGGGAT CGAAATTGAT GAAGTACTAT CAGCAAGAGG 5220

CGTCTAAACT CCGGAGACAG ATTCGGGACA TTCAGAATTT GAACAGACAC ATTCTTGGTG 5280

AATCTCTTGG TTCCTTGAAC TTTAAGGAAC TCAAGAACCT TGAAAGTAGG CTTGAGAAAG 5340

GAATCAGTCG TGTCCGATCC AAGAAGGTAC ATCACTAACT CTCCATCAAT CTCCTTATCA 5400

TTGAATATAT ATCCATCTGA TTCTTGCCCG TTATATTTGG TTTTTCTCTC CAGCACGAGA 5460

TGTTAGTTGC AGAGATTGAA TACATGCAAA AAAGGGTAAA AGTAAAACCT ATCTTCCTTC 5520

ACAATGAACT ACCCCTACTT TATTAGCAAC TTCTCTTTCT GATGATCATC TTTTTTATTT 5580

TCTGTTGTCG CTTGCATTGT AGGAT^TCGA GCTGCAAAAC GATAACATGT ATCTCCGCTC 5640

CAAGGTTTTA TACATAACTC TTTTTGGCAT TTTTGATCAT CATTTTTTTC CGGTAGACAA 5700

TCTCTTGATG TGCAAATTCT AAATATCTCT GCAGATTACT GAAAGAACAG GTCTACAGCA 5760

ATCAAGGGAC AGTTTACGAG TCGGGTGTTA CTTCTTCTCA 5820

CCAGTCGGGG CAGTATAACC GGAATTATAT TGCGGTTAAC CTTCTTGAAC CGAATCAGAA 5880

TTCCTCCAAC CAAGACCAAC CACCTCTGCA ACTTGTTTGA TTCAGTCTAA CATAAGCTTC 5940

TTTCCTCAGC CTGAGATCGA TCTATAGTGT CACCTAAATG CGGCCGCGTC CCTCAACATC 6000

TAGTCGCAAG CTGAGGGGAA CCACTAGTGT CATACGAACC TCCAAGAGAC GGTTACACAA 6060

ACGGGTACAT TGTTGATGTC ATGTATGACA ATCGCCCAAG TAAGTATCCA GCTGTGTTCA 6120
GAACGTACGT CCGAATTC 6138

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 896 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 7..753

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 896

(D) OTHER INFORMATION: /note= "There is a poly(A) tail at
the end of the cDNA sequence."

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..896

(D) OTHER INFORMATION: /note= "AGL1 cDNA and deduced
protein sequences."



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

GGATCA ATG GAG GAA GGT GGG AGT AGT CAC GAC GCA GAG AGT AGC AAG  48
       Met Glu Glu Gly Gly Ser Ser His Asp Ala Glu Ser Ser Lys
       1               5                   10

AAA CTA GGG AGA GGG AAA ATA GAG ATA AAG AGG ATA GAG AAC ACA ACA  96
Lys Leu Gly Arg Gly Lys Ile Glu Ile Lys Arg Ile Glu Asn Thr Thr
15                  20                  25                  30

AAT CGT CAA GTT ACT TTC TGC AAA CGA CGC AAT GGT CTT CTC AAG AAA  144
Asn Arg Gln Val Thr Phe Cys Lys Arg Arg Asn Gly Leu Leu Lys Lys
                35                  40                  45

GCT TAT GAA CTC TCT GTC TTG TGT GAT GCC GAA GTT GCC CTC GTC ATC  192
Ala Tyr Glu Leu Ser Val Leu Cys Asp Ala Glu Val Ala Leu Val Ile
            50                  55                  60

TTC TCC ACT CGT GGC CGT CTC TAT GAG TAC GCC AAC AAC AGT GTG AGG  240
Phe Ser Thr Arg Gly Arg Leu Tyr Glu Tyr Ala Asn Asn Ser Val Arg
        65                  70                  75

GGT ACA ATT GAA AGG TAC AAG AAA GCT TGT TCC GAT GCC GTC AAC CCT  288
Gly Thr Ile Glu Arg Tyr Lys Lys Ala Cys Ser Asp Ala Val Asn Pro
    80                  85                  90

CCT TCC GTC ACC GAA GCT AAT ACT CAG TAC TAT CAG CAA GAA GCC TCT  336
Pro Ser Val Thr Glu Ala Asn Thr Gln Tyr Tyr Gln Gln Glu Ala Ser
95                  100                 105                 110

AAG CTT CGG AGG CAG ATT CGA GAT ATT CAG AAT TCA AAT AGG CAT ATT  384
Lys Leu Arg Arg Gln Ile Arg Asp Ile Gln Asn Ser Asn Arg His Ile
                115                 120                 125

GTT GGG GAA TCA CTT GGT TCC TTG AAC TTC AAG GAA CTC AAA AAC CTA  432
Val Gly Glu Ser Leu Gly Ser Leu Asn Phe Lys Glu Leu Lys Asn Leu
            130                 135                 140

GAA GGA CGT CTT GAA AAA GGA ATC AGC CGT GTC CGC TCC AAA AAG AAT  480
Glu Gly Arg Leu Glu Lys Gly Ile Ser Arg Val Arg Ser Lys Lys Asn
        145                 150                 155

GAG CTG TTA GTG GCA GAG ATA GAG TAT ATG CAG AAG AGG GAA ATG GAG  528
Glu Leu Leu Val Ala Glu Ile Glu Tyr Met Gln Lys Arg Glu Met Glu
    160                 165                 170

TTG CAA CAC AAT AAC ATG TAC CTG CGA GCA AAG ATA GCG GAA GGC GCC  576
Leu Gln His Asn Asn Met Tyr Leu Arg Ala Lys Ile Ala Glu Gly Ala
175                 180                 185                 190

AGA TTG AAT CCG GAC CAG CAG GAA TCG AGT GTG ATA CAA GGG ACG ACA  624
Arg Leu Asn Pro Asp Gln Gln Glu Ser Ser Val Ile Gln Gly Thr Thr
                195                 200                 205

GTT TAC GAA TCC GGT GTA TCT TCT CAT GAC CAG TCG CAG CAT TAT AAT  672
Val Tyr Glu Ser Gly Val Ser Ser His Asp Gln Ser Gln His Tyr Asn
            210                 215                 220

CGG AAC TAT ATT CCG GTG AAC CTT CTT GAA CCG AAT CAG CAA TTC TCC  720
Arg Asn Tyr Ile Pro Val Asn Leu Leu Glu Pro Asn Gln Gln Phe Ser
        225                 230                 235

GGC CAA GAC CAA CCT CCT CTT CAA CTT GTG TAACTCAAAA CATGATAACT  770
Gly Gln Asp Gln Pro Pro Leu Gln Leu Val
    240                 245

TGTTTCTTCC CCTCATAACG ATTAAGAGAG AGACGAGAGA GTTCATTTTA TATTTATAAC  830

GCGACTGTGT ATTCATAGTT TAGGTTCTAA TAATGATAAT AACAAAACTG TTGTTTCTTT  890

GCTTCA  896

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 248 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Met Glu Glu Gly Gly Ser Ser His Asp Ala Glu Ser Ser Lys Lys Leu
1               5                   10                  15

Gly Arg Gly Lys Ile Glu Ile Lys Arg Ile Glu Asn Thr Thr Asn Arg
            20                  25                  30

Gln Val Thr Phe Cys Lys Arg Arg Asn Gly Leu Leu Lys Lys Ala Tyr
        35                  40                  45

Glu Leu Ser Val Leu Cys Asp Ala Glu Val Ala Leu Val Ile Phe Ser
    50                  55                  60

Thr Arg Gly Arg Leu Tyr Glu Tyr Ala Asn Asn Ser Val Arg Gly Thr
65                  70                  75                  80

Ile Glu Arg Tyr Lys Lys Ala Cys Ser Asp Ala Val Asn Pro Pro Ser
                85                  90                  95

Val Thr Glu Ala Asn Thr Gln Tyr Tyr Gln Gln Glu Ala Ser Lys Leu
            100                 105                 110

Arg Arg Gln Ile Arg Asp Ile Gln Asn Ser Asn Arg His Ile Val Gly
        115                 120                 125

Glu Ser Leu Gly Ser Leu Asn Phe Lys Glu Leu Lys Asn Leu Glu Gly
    130                 135                 140

Arg Leu Glu Lys Gly Ile Ser Arg Val Arg Ser Lys Lys Asn Glu Leu
145                 150                 155                 160

Leu Val Ala Glu Ile Glu Tyr Met Gln Lys Arg Glu Met Glu Leu Gln
                165                 170                 175

His Asn Asn Met Tyr Leu Arg Ala Lys Ile Ala Glu Gly Ala Arg Leu
            180                 185                 190

Asn Pro Asp Gln Gln Glu Ser Ser Val Ile Gln Gly Thr Thr Val Tyr
        195                 200                 205

Glu Ser Gly Val Ser Ser His Asp Gln Ser Gln His Tyr Asn Arg Asn
    210                 215                 220

Tyr Ile Pro Val Asn Leu Leu Glu Pro Asn Gln Gln Phe Ser Gly Gln
225                 230                 235                 240

Asp Gln Pro Pro Leu Gln Leu Val
                245



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 959 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 78..818

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..959

(D) OTHER INFORMATION: /note= "AGL5 cDNA and deduced
protein sequences."

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

GAATTCATCT TCCCATCCTC ACTTCTCTTT CTTTCTGATC ATAATTAATC TTGCTAAGCC 60

AGCTAGGGCT TATAGAA ATG GAG GGT GGT GCG AGT AAT GAA GTA GCA GAG 110
Met Glu Gly Gly Ala Ser Asn Glu Val Ala Glu
1 5 10

AGC AGC AAG AAG ATA GGG AGA GGG AAG ATA GAG ATA AAG AGG ATA GAG 158
Ser Ser Lys Lys Ile Gly Arg Gly Lys Ile Glu Ile Lys Arg Ile Glu
15 20 25

AAC ACT ACG AAT CGT CAA GTC ACT TTC TGC AAA CGA CGC AAT GGT TTA 206
Asn Thr Thr Asn Arg Gln Val Thr Phe Cys Lys Arg Arg Asn Gly Leu
30 35 40

CTC AAG AAA GCT TAT GAG CTC TCT GTC TTG TGT GAC GCT GAG GTT GCT 254
Leu Lys Lys Ala Tyr Glu Leu Ser Val Leu Cys Asp Ala Glu Val Ala
45 50 55

CTT GTC ATC TTC TCC ACT CGA GGC CGT CTC TAC GAG TAC GCC AAC AAC 302
Leu Val Ile Phe Ser Thr Arg Gly Arg Leu Tyr Glu Tyr Ala Asn Asn
60 65 70 75

AGT GTG AGA GGA ACA ATA GAA AGG TAC AAG AAA GCT TGC TCC GAC GCC 350
Ser Val Arg Gly Thr Ile Glu Arg Tyr Lys Lys Ala Cys Ser Asp Ala
80 85 90

GTT AAC CCT CCG ACC ATC ACC GAA GCT AAT ACT CAG TAC TAT CAG CAA 398
Val Asn Pro Pro Thr Ile Thr Glu Ala Asn Thr Gln Tyr Tyr Gln Gln
95 100 105

GAG GCG TCT AAA CTC CGG AGA CAG ATT CGG GAC ATT CAG AAT TTG AAC 446
Glu Ala Ser Lys Leu Arg Arg Gln Ile Arg Asp Ile Gln Asn Leu Asn
110 115 120

AGA CAC ATT CTT GGT GAA TCT CTT GGT TCC TTG AAC TTT AAG GAA CTC 494
Arg His Ile Leu Gly Glu Ser Leu Gly Ser Leu Asn Phe Lys Glu Leu
125 130 135

AAG AAC CTT GAA AGT AGG CTT GAG AAA GGA ATC AGT CGT GTC CGA TCC 542
Lys Asn Leu Glu Ser Arg Leu Glu Lys Gly Ile Ser Arg Val Arg Ser
140 145 150 155

AAG AAG CAC GAG ATG TTA GTT GCA GAG ATT GAA TAC ATG CAA AAA AGG 590
Lys Lys His Glu Met Leu Val Ala Glu Ile Glu Tyr Met Gln Lys Arg
160 165 170

GAA ATC GAG CTG CAA AAC GAT AAC ATG TAT CTC CGC TCC AAG ATT ACT 638
Glu Ile Glu Leu Gln Asn Asp Asn Met Tyr Leu Arg Ser Lys Ile Thr
175 180 185

GAA AGA ACA GGT CTA CAG CAA CAA GAA TCG AGT GTG ATA CAT CAA GGG 686
Glu Arg Thr Gly Leu Gln Gln Gln Glu Ser Ser Val Ile His Gln Gly
190 195 200

ACA GTT TAC GAG TCG GGT GTT ACT TCT TCT CAC CAG TCG GGG CAG TAT 734
Thr Val Tyr Glu Ser Gly Val Thr Ser Ser His Gln Ser Gly Gln Tyr
205 210 215

AAC CGG AAT TAT ATT GCG GTT AAC CTT CTT GAA CCG AAT CAG AAT TCC 782
Asn Arg Asn Tyr Ile Ala Val Asn Leu Leu Glu Pro Asn Gln Asn Ser
220 225 230 235

TCC AAC CAA GAC CAA CCA CCT CTG CAA CTT GTT TGATTCAGTC TAACATAAGC 835
Ser Asn Gln Asp Gln Pro Pro Leu Gln Leu Val
240 245

TTCTTTCCTC AGCCTGAGAT CGATCTATAG TGTCACCTAA ATGCGGCCGC GTCCCTCAAG 895

ATCTAGTCGC AAGCTGAGGG GAACCACTAG TGTCATACGA ACCTCCAAGA GACGGTTACA 955

CAAA 959



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 246 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Met Glu Gly Gly Ala Ser Asn Glu Val Ala Glu Ser Ser Lys Lys Ile
1 5 10 15

Gly Arg Gly Lys Ile Glu Ile Lys Arg Ile Glu Asn Thr Thr Asn Arg
20 25 30

Gln Val Thr Phe Cys Lys Arg Arg Asn Gly Leu Leu Lys Lys Ala Tyr
35 40 45

Glu Leu Ser Val Leu Cys Asp Ala Glu Val Ala Leu Val Ile Phe Ser
50 55 60

Thr Arg Gly Arg Leu Tyr Glu Tyr Ala Asn Asn Ser Val Arg Gly Thr
65 70 75 80

Ile Glu Arg Tyr Lys Lys Ala Cys Ser Asp Ala Val Asn Pro Pro Thr
85 90 95

Ile Thr Glu Ala Asn Thr Gln Tyr Tyr Gln Gln Glu Ala Ser Lys Leu
100 105 110

Arg Arg Gln Ile Arg Asp Ile Gln Asn Leu Asn Arg His Ile Leu Gly
115 120 125

Glu Ser Leu Gly Ser Leu Asn Phe Lys Glu Leu Lys Asn Leu Glu Ser
130 135 140

Arg Leu Glu Lys Gly Ile Ser Arg Val Arg Ser Lys Lys His Glu Met
145 150 155 160

Leu Val Ala Glu Ile Glu Tyr Met Gln Lys Arg Glu Ile Glu Leu Gln
165 170 175

Asn Asp Asn Met Tyr Leu Arg Ser Lys Ile Thr Glu Arg Thr Gly Leu
180 185 190

Gln Gln Gln Glu Ser Ser Val Ile His Gln Gly Thr Val Tyr Glu Ser
195 200 205

Gly Val Thr Ser Ser His Gln Ser Gly Gln Tyr Asn Arg Asn Tyr Ile
210 215 220

Ala Val Asn Leu Leu Glu Pro Asn Gln Asn Ser Ser Asn Gln Asp Gln
225 230 235 240

Pro Pro Leu Gln Leu Val
245
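
As an illustrative consistency check that is not part of the original listing, the CDS feature given for SEQ ID NO: 7 (78..818) can be compared with the length stated for SEQ ID NO: 8 (246 amino acids), and the first codons of the CDS can be translated. The sketch below assumes the standard genetic code; the helper names `translate` and `cds_protein_length` are hypothetical and introduced only for this example.

```python
from itertools import product

# Standard genetic code (an assumption), codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate DNA codon by codon; spaces are ignored, a trailing partial codon is dropped."""
    dna = dna.replace(" ", "").upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def cds_protein_length(start: int, end: int) -> int:
    """Residues encoded by a 1-based, inclusive CDS that ends with a stop codon."""
    n = end - start + 1
    if n % 3:
        raise ValueError("CDS length is not a multiple of three")
    return n // 3 - 1  # subtract the stop codon


# First eleven codons of the SEQ ID NO: 7 CDS, copied from the listing.
first_codons = "ATG GAG GGT GGT GCG AGT AAT GAA GTA GCA GAG"
# Last three codons plus the stop codon, copied from the listing.
last_codons = "CAA CTT GTT TGA"

print(translate(first_codons))      # one-letter codes for Met Glu Gly Gly Ala ...
print(translate(last_codons))       # ends with '*' for the stop codon
print(cds_protein_length(78, 818))  # residue count implied by LOCATION 78..818
```

Under these assumptions, the 741-nucleotide CDS corresponds to 246 residues plus a stop codon, which agrees with the length stated for SEQ ID NO: 8.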



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..27

(D) OTHER INFORMATION: /note= "Primer AGL8 5-4"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
CCGTCGACGA TGGGAAGAGG TAGGGTT 27 

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..20

(D) OTHER INFORMATION: /note= "Primer OAM14."

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
AATCATTACC AAGATATGAA 20 
(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
CGGATAGCTC GAATATCG 18

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
AACATTGCGT CGTTTGC 17



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

GTAATTACCA GGCAAGGACT CTCC 24
(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
GTCATCGGCG GGGGTCATAA CGTG 24

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
GAGGATAGAG AACACTACGA ATCG 24

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
CAGGTCAAGT CAATAGATTC 20
(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
CAGAATTTAG TGAATAATAT TG 22
(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
GCCAGAGATA ATGCTATTCC 20
(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
CATTGATCCA TATATGACAT CAC 23
(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
GTGATGTCAT ATATGGATCA ATGGGAAGAG GTAGGGTTCA G 41
(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
CAAGAGTCGG TGGAATATTC G 21

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 41 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
CGAATATTCC ACCGACTCTT GGTACGCTTC TCCTACTCTA T 41
(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
CTAATAAGTA AGATCGCGGA A 21
(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 41 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
TTCCGCGATC TTACTTATTA GCATGGAGAG GATACTTGAA C 41
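
An observation on the primer sequences of SEQ ID NOS: 19 to 24 can be checked mechanically: each 41-nucleotide primer (SEQ ID NOS: 20, 22, 24) appears to begin with the reverse complement of the shorter primer listed immediately before it (SEQ ID NOS: 19, 21, 23). The Python sketch below is illustrative only and is not part of the original listing; the function names `reverse_complement` and `is_overlap_pair` are hypothetical, and the sequences are copied from the listing as printed.

```python
# Complement mapping for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Primer sequences copied from SEQ ID NOS: 19 to 24 of the listing.
PRIMERS = {
    19: "CATTGATCCA TATATGACAT CAC",
    20: "GTGATGTCAT ATATGGATCA ATGGGAAGAG GTAGGGTTCA G",
    21: "CAAGAGTCGG TGGAATATTC G",
    22: "CGAATATTCC ACCGACTCTT GGTACGCTTC TCCTACTCTA T",
    23: "CTAATAAGTA AGATCGCGGA A",
    24: "TTCCGCGATC TTACTTATTA GCATGGAGAG GATACTTGAA C",
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string, ignoring spaces."""
    return seq.replace(" ", "").upper().translate(COMPLEMENT)[::-1]


def is_overlap_pair(short_id: int, long_id: int) -> bool:
    """True if the longer primer starts with the reverse complement of the shorter one."""
    long_seq = PRIMERS[long_id].replace(" ", "")
    return long_seq.startswith(reverse_complement(PRIMERS[short_id]))


for short_id, long_id in [(19, 20), (21, 22), (23, 24)]:
    print(short_id, long_id, is_overlap_pair(short_id, long_id))
```

Complementary 5' ends of this kind would be consistent with primers designed to join two amplified fragments, although the listing itself does not state how the primers were used.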
We claim:

1. A non-naturally occurring seed plant,
comprising an ectopically expressed nucleic acid molecule
encoding an AGL8-like gene product, said seed plant
characterized by delayed seed dispersal.

2. The non-naturally occurring seed plant of
claim 1, wherein said AGL8-like gene product has
substantially the amino acid sequence of an AGL8
ortholog.

3. The non-naturally occurring seed plant of
claim 2, wherein said AGL8-like gene product has the
amino acid sequence of Arabidopsis AGL8 (SEQ ID NO:2).

4. The non-naturally occurring seed plant of
claim 3, which is a transgenic seed plant.

5. The transgenic seed plant of claim 4,
wherein said ectopically expressed nucleic acid molecule
encoding an AGL8-like gene product is operatively linked
to an exogenous regulatory element.

6. The transgenic seed plant of claim 5,
wherein said exogenous regulatory element is a
constitutive regulatory element.

7. The transgenic seed plant of claim 6, said
nucleic acid molecule comprising an exogenous nucleic
acid molecule encoding substantially the amino acid
sequence of an AGL8 ortholog operatively linked to a
cauliflower mosaic virus 35S promoter.
8. The transgenic seed plant of claim 5,
wherein said exogenous regulatory element is a dehiscence
zone-selective regulatory element.

9. The transgenic seed plant of claim 8,
wherein said dehiscence zone-selective regulatory element
is selected from the group consisting of an AGL1
regulatory element and an AGL5 regulatory element.

10. The transgenic seed plant of claim 9,
wherein said nucleic acid molecule encoding an AGL8-like
gene product is an exogenous nucleic acid molecule
encoding substantially the amino acid sequence of an AGL8
ortholog.

11. The transgenic seed plant of claim 10,
wherein said AGL8-like gene product has the amino acid
sequence of Arabidopsis AGL8 (SEQ ID NO:2).

12. The transgenic seed plant of claim 9,
wherein said dehiscence-zone selective regulatory element
is an AGL1 regulatory element comprising at least fifteen
contiguous nucleotides of a nucleotide sequence selected
from the group consisting of:

nucleotides 1 to 2599 of SEQ ID NO:3;

nucleotides 2833 to 4128 of SEQ ID NO:3;

nucleotides 4211 to 4363 of SEQ ID NO:3;

nucleotides 4426 to 4554 of SEQ ID NO:3;

nucleotides 4655 to 4753 of SEQ ID NO:3;

nucleotides 4796 to 4878 of SEQ ID NO:3;

nucleotides 4921 to 5028 of SEQ ID NO:3; and

nucleotides 5421 to 5682 of SEQ ID NO:3.
13. The transgenic seed plant of claim 9,
wherein said dehiscence-zone selective regulatory element
is an AGL5 regulatory element comprising at least fifteen
contiguous nucleotides of a nucleotide sequence selected
from the group consisting of:

nucleotides 1 to 1888 of SEQ ID NO:4;

nucleotides 2928 to 5002 of SEQ ID NO:4;

nucleotides 5085 to 5204 of SEQ ID NO:4;

nucleotides 5367 to 5453 of SEQ ID NO:4;

nucleotides 5496 to 5602 of SEQ ID NO:4;

nucleotides 5645 to 5734 of SEQ ID NO:4; and

nucleotides 6062 to 6138 of SEQ ID NO:4.

14. The non-naturally occurring seed plant of
claim 1, which is a dehiscent seed plant.

15. The non-naturally occurring seed plant of
claim 14, which is a member of the Brassicaceae.

16. The non-naturally occurring seed plant of
claim 14, which is a member of the Fabaceae.

17. A non-naturally occurring seed plant, in
which AGL1 expression and AGL5 expression each are
suppressed, said seed plant characterized by delayed seed
dispersal.

18. The non-naturally occurring seed plant of
claim 17, which is an agl1 agl5 double mutant.
19. A tissue derived from a non-naturally
occurring seed plant, said seed plant comprising an
ectopically expressible nucleic acid molecule encoding an
AGL8-like gene product and characterized by delayed seed
dispersal.

20. The tissue of claim 19, which is a seed.

21. A tissue derived from a non-naturally
occurring seed plant, in which AGL1 expression and AGL5
expression each are suppressed, said seed plant
characterized by delayed seed dispersal.

22. The tissue of claim 21, which is a seed.

23. A method of producing a non-naturally
occurring seed plant characterized by delayed seed
dispersal, comprising ectopically expressing a nucleic
acid molecule encoding an AGL8-like gene product in said
seed plant, whereby seed dispersal is delayed due to
ectopic expression of said nucleic acid molecule.

24. A substantially purified dehiscence
zone-selective regulatory element, comprising a
nucleotide sequence that confers selective expression
upon an operatively linked nucleic acid molecule in the
valve margin or dehiscence zone of a seed plant,

provided that said dehiscence zone-selective
regulatory element does not have a nucleotide sequence
consisting of nucleotides 1889 to 2703 of SEQ ID NO:4.
25. The substantially purified dehiscence
zone-selective regulatory element of claim 24, which is
selected from the group consisting of an AGL1 regulatory
element and an AGL5 regulatory element.

26. The substantially purified dehiscence
zone-selective regulatory element of claim 25, which is
an AGL1 regulatory element comprising at least fifteen
contiguous nucleotides of a nucleotide sequence selected
from the group consisting of:
nucleotides 1 to 2599 of SEQ ID NO:3;
nucleotides 2833 to 4128 of SEQ ID NO:3;
nucleotides 4211 to 4363 of SEQ ID NO:3;
nucleotides 4426 to 4554 of SEQ ID NO:3;
nucleotides 4655 to 4753 of SEQ ID NO:3;
nucleotides 4796 to 4878 of SEQ ID NO:3;
nucleotides 4921 to 5028 of SEQ ID NO:3; and
nucleotides 5361 to 5622 of SEQ ID NO:3.

27. The substantially purified dehiscence
zone-selective regulatory element of claim 25, which is
an AGL5 regulatory element comprising at least fifteen
contiguous nucleotides of a nucleotide sequence selected
from the group consisting of:
nucleotides 1 to 1888 of SEQ ID NO:4;
nucleotides 2928 to 5002 of SEQ ID NO:4;
nucleotides 5085 to 5204 of SEQ ID NO:4;
nucleotides 5367 to 5453 of SEQ ID NO:4;
nucleotides 5496 to 5602 of SEQ ID NO:4;
nucleotides 5645 to 5734 of SEQ ID NO:4; and
nucleotides 6062 to 6138 of SEQ ID NO:4.

28. A plant expression vector, comprising a
dehiscence zone-selective regulatory element.
29. A kit for producing a transgenic seed
plant characterized by delayed seed dispersal, comprising
a dehiscence zone-selective regulatory element having a
nucleotide sequence that confers selective expression
upon an operatively linked nucleic acid molecule in the
valve margin or dehiscence zone of a seed plant,

provided that said dehiscence zone-selective
regulatory element does not have a nucleotide sequence
consisting of nucleotides 1889 to 2703 of SEQ ID NO:4.

30. The kit of claim 29, said dehiscence
zone-selective regulatory element is operatively linked
to a nucleic acid molecule encoding an AGL8-like gene
product.
CCCAGAGAGACATAAGAAAGAAAOAGAGAGAGAGATACTT
TGGTCATTTCAGGGTTGTCGTTTCTCTCTCTTGTTCTTGAGATTTTGAAGAGAGAGAGAT 
1 ATGGGAAGAGGTAGGGTTCAGCTGAAGAGGATAGAGAACTU^GATCAATAGGCAAGTT 
IMG R GRV OLKRIE NKINROVT 

61 TTCTO^GAGAAGGTCTGGTTTGCTO^GAAAGCrCATGAGATCT 

21 F S K R R S G L L K K A H E I S V L C D 

121 GCTGAGGTTGCTCTCATCGTCTTCTCTTCCT^GGCAAACTCTTCGAATAT^ 
41 A EVAL I V F S S KG KL F E Y S T D 

181 TCTTGCATGGAGAGGATACTTGAACGCTATGATCGCTATTTATATTCAGACT^ 
61 S C M E R I L E R Y D R Y L Y S D K Q L 

24 1 GTTGGCCGAGACGTTTCACAAAGTGAAAATTGGGTTCTAGAACATGCTAAGCTCAAG^ 
81VGR DVS QSENW VLE HAKLKA 

301 AGAGTTGAGGTACTTGAGAAGAACAAAAGGAATTTTATGGGGGAAGATCOT 

101 RVEVL EKNKRNF MGEDLDSL 

361 AGCTTGAAGGAGCTCCAAAOCrTGGAGCATCAGCrCGATGC^GCTATCiU^GAGC^ 
121 SLKE LOSLE HOLDAAI KS IR 

421 TCAAGAAAGAACCAAGCTATGTTCGAATCCATATCTGCGCTCCAGAAGAAGGATAAAGCC 
141 S R KN OAM F E S I S ALO K KD K A 

481 TTGCJU^GATCACAACAATTCGCTTCTC^AAAAGATT^ 

161 L Q D H N N S L L K K I K E R E K K T G 

541 Cy^GCAAGAAGGACAATTAGTCCAATGCTCa^^CTCTTCTTCAGTTCTTCTO 
IBlQQ E GQ LVQCSNSSSV LLPQY 

601 TGCGTAACCTCCTCCAGAGATGGCTTTGT6GAGAGAGTTGGGGGAGAGAACGGTGGTGCA 
201C VTS SRD6FVBRV6GBNG G A 

661 TCGTCGTTGACGGAACCAAACTCrcrGCTTCCGGCITGGATGTTACGTCCT 

221 S S LT E P N S L L PAWM L R P T TT 

721 AACXSAGTAGAACTATCTCACrCTTTATAATATAATGATAATATAATTAATGTTT^ 
241 N E * 

781 TTCATAAOVTTCAGCATTTTTTTGGTGACITATACTCATTATTAATACCGATAT^ 
841 GCTAGTCATATTATATGTATGATGGAACTCCGTTGTCGAGACGTATGTACGTAAGCTATC 
901 ATTAGATTCACTGCGTCTTAAGAACT^GATTCyVTATCTTGGTAATGATTTCT 
961 TAn 

FIG. 6 




60

* * * « * ■ ★ "* 

AGATCTGCAA CAGTGAAAAG AGAAAACAAA ATGGACTTGA AGAGGTTTTG ACAATGCCAG 

120 

* * *• *• 
•AGATAATGCT TATTCCCTAA TATGTTGCCA GCCAAGTGTC AAATTGGCTT TTTAAATATG 

180 

GATTTCTGTA TCAGTGGTCA TATITCTOGA TCCAACGTAT TCATCATCAA GTTCTCAAGT 

.240 

* ♦ . * * . * . . * 
•nGCTTTCAG TGCAATTCTA ATTCACACGT TTAACTTTAA CATGCATGTC ATTATAATTA 

300 

* * - * . ♦ " •* * 
CTTCTTCACT AAGACACAAT ACGGCAAACC TTTCAGATTA TATTAATCTC CATAAATGAA 

360 

* *• * * ★ ■ * 

ATAATTAACC TCATAATCAA GATTCAATGT TTCTAAATAT ATATGGACAA AATTTACACG 

420 

* * * * * * 

GAAGATTAGA TACGTATATT AGTAGATTTA GTCTTTCGTT TGTGCGATAA GATTAACCAC 

480 

* ♦ * ♦ * , * 

CTCATAGATA GTAATATCAT TGTCAAATTC CTCTCGGTTT AGTCGCTAAA TTGTATCTTT 

540 

* ♦ * ♦ . ♦ * 
TTTAAGCCTA AAAGTAGTGT ATTCGCATAT GACTTATCGT CCTAACTTTT TTTTTAATTA 

600 

* * ♦ ♦ * 

ACAAAAAAAT CGAAAAGAAA ATAATCTGTT AAATATTTTT TAAGTACICC ATTAACjITTA 

660 

* * ♦ * ♦ - * 

GTTTCTATTT AAAAAATGCT TGAAATTTGA CAGOTATGTT OACAATnt GAATCATGAG 

720 

* * * ■ ♦ ■ ♦ 

CGATGTCTAG ATACTCAGAA TTTAATCAAG ATGTCTTATC AAATITGTTG TCACTCGAGG 

780 

* ♦ * * . ■ ♦ * 
ACCCACGCAA AAGAAAAGAC TAATATGATT TTTATTTGGT CTGGATATTT TTGTAGAGGA 

840 

* ♦ ■ ♦ ■ <* ♦ ' ♦ ■ 
TGAAACTAAG AGAGTGAAAG ATTCGAAATC CAGAATGHC AAGAGAGCTC AAAGCAAAAA 

900 

* * ♦ « ■ « * 
GAAAAATGAA GATGAAGGAC TAAAGAACAA TAAGCAACTA CTTATACCCT ATTTCCATAA 
* * * * * *
AGG^^rrCAGG.TACTAGGAGA AGTTGAOGCA AGTTtlNNNNN NATTGATTCA AATTTTCATT 

' : . 1020 

TATTTTTACA ATTTAATTCA CCTAAGTTAT TATGCATTTC TCATCATTGG TACATTTTCT 

1080 

* * * * * * - 

GTATAGCGTA TTTACATATA TGAAATAAAT TAAATATGTC CTCACGTTGC AAGTAGTTAA 

1140 

* ■ *. * 

TGAATGTCCC CACGCAAAAA AAAATCCCTC CAAATATGTC CACCTTTTCT TTTCTTTTTA 

1200 

. * * . * * * * 

ATTCCAAAAT TACCATAAAC TTTTGGTTTA CAAAAGATTT CTAGAAATTO AGGAAGATAT 

1260 

* • * . ★ « ■ ♦ * 
CCTAAATCAT TCATGAATCC TTCAATAATC TGAAGTTTGC GATATTTTCG ATTTTCTTCA 

1320 

* ' ♦ * ♦ " - * * 
AGAGTTGCGA TATTTGTAAT TTGGTGACCT TAAACTTTTT ITGATAAAGA GTAAACGITT 

1380 

* ♦ • ♦ - ♦ • ♦ * • 
TTICTTAAAA GTAAAACTTG AnTTATGTT TTAGGGTTCT AGCTCAACTT TGTATTATAT 

1440 

* * . * ■ . * * * 
TTCTTGCAAA AAGAGTTCGT TAACTGCATT CTTCAACACT ATAAAGTGAT TATCAAAAAC 

1500 

* • ♦ * * * 

ATCTTCATGA ACATTAAGAA AAACAATATT TGGTTTCGGT TAGAGCTTGG TTTTGCTTGG 

1560 

* * * * . * * 
CTTGATTCAC ATACCCATTC TAGACTTrGG CATAAATTTG ATACGATAGA GAGTATCTAA 

1620 

* ♦ . * * ♦ . * 
TGGTAATGCA' GAAGGGTAAA AAAAGGAAGA GAGAAAAOGT GAGAAAGATT ACCAAAAATA 

1680 

« _ * '* * * '* ■ 

- AGGAGTTTCA AAAGATGGTT CTGATGAGAA ACAGAGCCCA TCCCTCTCCT TTTCCCCITC 

1740 

* ♦ ■ ♦ ♦ ♦ ■ * 
CCATGAAAGA AATCGGATGG TCCTCCTTCA ATGTCCTCCA CCTACTCTTC TCTTCTTTCT 

1800 

* * - ' ♦ * * * • 

' iTrri ' itTrr cttattatta accatttaat taatttcccc ttcaatttca gtitctagtt 

I860 

^ * A . A ' * A' 
CTGTAAAAAG AAAATACACA TCTCACTTAT AGATATCCAT ATCTATTTAT ATGCATGTAT

1920 

( ♦ ♦ ★ ' * ♦ * 

AGAGAATAAA AAAGTGTGAG TTTCTAGGTA TGTTGAGTAT GTGCTGTTTG GACAATTGTT 

1980 

AGATGATCTG TCCATTTTTT TCTTTTTTCT TCTGTGTATA AATATATTTG AGCACAAAGA 

2040 

*..* *. *' ** 

AAAACTAATA ACCTTCTGTT TICAGCAACT AGGGTCTTAT AACXTTICAAA GAAATATTCC 

■ 

2100 

* ■ . ♦ * * * ♦ 

TTCAATTGAA AACCCATAAA CCAAAATAGA TATTACAAAA GGAAAGAGAG ATATTTTCAA 

2160 

* . * ♦ * .* ♦ ♦ 
GAACAACATA ATTAGAAAAG CAGAAGCAGC AGTTAAGTGG TACTGAGATA AATGATATAG 

2220 

* * ♦ * ■ ♦ ■ * 

TTTCTCTTCA AGAACAGTTT GTCATTACCC ACCTTCTCCT TITTGCTGAT CTATCGTAAT 

2280 

* * ♦ ♦ * " * ' 
CtTGAGAACT CAGGTAAGGT TGTGAATATT ATGCACCATT CATTAACCCT AAAAATAAGA 

2340 

* * ♦ ' * ♦ . * 
GAOTTAAAAT AAATGTTTCT TCTTTCTCTG ATTCTTGTGT AACCAATTCA TGGGTTTGAT 

2400 

ATGTTTCTTG GTTATTGCTT ATCAACAAAG AGATTTGATC ATTATAAAGT AGATTAATAA 

2460 

* ■ * ■ ♦ ' * * . * 
CTCTTAAACA CACAAAGTTT CTTTATTTTT TAGTTACATC CCTAATTCTA GACCAGAACA 

2520 

• ♦ ♦ * * ♦ * . . 

TGGATTTGAT CTAtlTCTTG GTTATGTATC TTGATCAGGA AAAGGGATTT GATCATCAAG 

2580 

* ♦ . * ' ♦ ■ ♦ . ♦ . 
ATTAGCCTTC TCTCTCTCTC TCTAGATATC TTTCTTGAAT TTAGAAATCT TTATTTAATT . 

2640
translation start

ATTTGGTGAT GTCATATATG GATCAATGGA GGAAGGTGGG AGTAGTCACG ACGCAGAGAG



2700 
* 



TAGCAAGAAA CTAGGGAGAG GGAAAATAGA GATAAAGAGG ATAGAGAACA CAACAAATCG exon 1

2760 

* » ♦ * * ♦ 

TCAAGTTACT TTCTGCAAAC GACGCAATGG TCTTCTCAAG AAAGCTTATG AACTCTCTGT 
* * * * * *

CTTGTGTGAT GCCGAAGTTG CXXTTCGTCAT CTTCTCCACT CGTGGCCGTC TCTATGAGTA 

2880 

* _ * * . * ♦ * 
CGCCAACAAC ^STACGCTT CTCCTACTCT ATTTCTTGAT CnXilTriC T TAATTTTAAC 

2940 

* ♦ * « * • * 

TAAACAAGAT CCTAGTTCAA ATGATAACAA AGTGGGGAIT GAGAGCCAAG ATTAGGGTTT 

3000 

* ■ ♦ * ♦ ■ ♦ ■ * 
GGTTAATTTA GAAAACCAGA TTTCACTTGT TGATACATTT AATATCTCTC TAGCTAGATT 

3060 

* ♦ ■ * * * ♦ . 

TAGTACTCTC TCCTCTATAT ATGTGTGGGT GTGTGTGTAA GTGTGTATAT GTATGCAAAT 

3120 

* ♦ * * ♦ * 

GCAAGAAGAA GAAGAAAAAG TTATCTTGTC TTCTCAAATT CTGATCAGCT TTGACCTTAG 

3180 

* ♦ ♦ ★ * * 

TTTCACTCTT TTTTCTGCAA ATCATTTGAA CCTGATGCAT GTCAGTTTCT ACAATACACT 

3240 

* * * * * ' ♦ 

TTTAATTTTG ACGGCCCATC AAATTTCCTA GGGTTTACTT CAGTGAACAA AATTGGGTTC 

3300 

* ♦ ♦ * * • * 

TTGACACGAT TTAGCATGTA TATATAAAAA TAGGGGATGA TCAAGACTTA TGTAACCTCT 

3360 

* ♦ « * * * 

GTCTGGTGAA ACTAGGGACA AAGTCTACTG ATGAGTTGTC ACTAGGGATC CATTTGATCA 

3420 

* * * ♦ * * 

TTTAATCCCA ACAAAAATGA AACAAAATTT TGAGAATTTA TATGCTGAAG TTTTTCAACC 

3480 

CTCmnTA AATAACTTTA TATTATGTAG ATTTGrATTT AGGGTAATTT GTCCAACTAG 

3540 

* • ♦ * * , * * 
AAGTCCTAAA AATCAATAAA CACACGGATG ACnTGTCTA ACATTGTATC AGTCATCAAA 

3600 

* * ★ ♦ ♦ . * 
TGTAAAATTG TACAAATAAT GAAATTAAAG ATTTAGTCTC TTTTATTTTT TTTGTTTAGG 

3660 

GTGTATATAT ATATATATAT GTATATTTGT TGCATTGATA TATCAATGAG AGGGAGAGAA 

3720 

A A A ^ * ^ 
CTCAGAGAAG TGTCGGAAAT TAAAATGGTA CGAGCCAATT GGAATCTCTG GCATTCTGAG

3780 

CTTCATTTGT TTGTTATTAG AAAAAAAAAA AAAAAATCCT TTAAAGATAC CTTCATGATG 

' ■ . * • 

3840 

* *' ■ « ♦ ' ♦ • ♦ 
ACATTGAATC ATGTAATATA CACGATACAT GGTCTAATTC CTCCTCAAAC CXnAATTACC 

3900 

* *. • ♦ ■ * * * 
AATTTCGAAA CCATAATATT TACTAGTATG TTTATATATC CTTACTITAA GACATrGTTT 

3960 

* * * * * ■ ♦ ' . 
GTITATAATA CCTTGTGAAT TAAGAAAAAA AAAAAAAAAC TTGTGGATCT ATTCAAGCXyi 

4020 

* * ♦ ' ♦ * ♦ ■ 

TGTGTTAGAA TAAATTTATA AATTTTCTCC TCGTACTGGT CAGATATTGG TCCAAACTCC 

4080 

*- • * * * * 

AAAGCCTTCC CTTTTCAGGA AAAAAAACAT TTCGAAA1TA ACTCTAATTA ATCAAGAATT 

. 4140 

* ♦ ♦ • ■.* * *. 
TCCTACAATG TATACATCTA ATGTTTnTC CGCGATCTTA CTTATTAG^ TGAGGGGTAC 

4200 

* . * * * * * exon2 

AATTGAAAGG TACAAGAAAG CTTGTTCCGA TGCCGTCAAC CCTCCTTCCG TGACCGAAGC 

4260 

* * ♦ ★ • * ♦ 

TAATACTCAg]gTACCAATTT ATATTGTTTG ATTCTCTTTG TlTTATCTrC TTCTTTTCAT 

4320 

TATATATATG ATCAACAAAA AATATAACCT ACAAAAAGAG AGAGTTCAAG GAAATGCATT 

4380 

* * * . ' * I—. * * 
GAAACGGTTT CGTTATGGTG TTTGAATACA TCGATTTTTG AAG^CTATC AGCAAGAAGC 

4440 exon 3

* * * * * * 

CTCTAAGCTT CGGAGGCAGA TTCGAGATAT TCAGAATTCA AAT^STAAT TCATTAACTT 

4500 

* * ♦ * ♦ * 

TTCATGAACT CTTCGATTTG GTATTAGGTC ACTTAATTTO GTGTCGGTCC AAAAGTCCGC 

4560 

* * * ♦ ■ • « ^ * 
TICTAGTnT CTTTAGAAGT 'l\ : il ' rriVj ' rrr AATGTTCATG TTTACAAATT GAAQ^ATAT 

4620 exon 4 

* ' . ♦ * ' ♦ * - * 

TGTTGGGGAA TCACTTGGTT CCTTGAACTT CAAGGAACTC AAAAACCTAG AAGGACGTCT 
* * * * * *

TGAAAAAGGA ATCAGCCX3TG TCCX3CTCCAA AAaStTAAAA TCTACGTTGC TCTCTCTCTG 



4740 

*. * .** ♦ * 

TCTCrCTGTC TCTCTCTCTA TATATAGTCC CTTAGTTTAT ATAGTTCATC ACCCTTITCT 

^800 

* * * * ♦ ^ * - 

GAGAATTTTG CAGAATGAGC TGTTAGTGGC AGAGATAGAG TATATGCAGA AGAGGGTAAG exon 5

4860 

* * ★ ♦ ♦ • * . 
AACGTTTCTC CCATTCCAAG TAATTAGATC TTTCTTCGTC TTTGTGAGGG TTTGAGTnT 

4920 

* ♦ * * . * ♦ 

CCCATAAATC ATGTGTAGGA AATGGAGTTG CAACACAATA ACATGTACCT GCGAGCAAAG exon 6



4980 

* ♦ * ♦ ♦ ♦ 

GTTAGCCACG TTCTGTTCCA AATCTTAATC TCAATATCTA ctcttttctt cattgtataa 

5040 

* * * • ♦ * * 

ctaagataac gtgaataaca agaaaacttt tgtttttggg tttaatag^ agccgaaggc 

5100 

* ♦ ♦ * ★ ' ♦ 

gccagattga atccggacca gcaggaatcg agtgtgatac aagggacgac agtttacgaa 

5160 

* * * * * * 

TCCGGTGTAT CTTCTCATGA CCAGTCGCAG CATTATAATC GGAACTATAT TCCGGTGAAC 



5220
Stop codon

CTTCTTGAAC CGAATCAGCA ATTCTCCGGC CAAGACCAAC CTCCTCTTCA ACTTGTGTAA

5280 

♦ ♦ ♦ * * * 

CTCAAAACAT GATAACTTGT TTCTTCCCCT CATAACGATT AAGAGAGAGA CGAGAGAGTT 

5340 

♦ * *- " ♦ * * 
CATTTTATAT' TTATAACGCG ACTGTGTATT CATAGTTTAG GITCTAATAA TGATAATAAC 

5400 

* * * . * * ♦ 

AAAACTGTTG TTTCTTTGCT TAATTAGATC AACATTTAAA TCCAAAGTTC TAAAACACGT 

5460 

★ ♦ ♦ ♦ * . * 
CGAGATCCAA AGTTTGTCAT ACAAGATTAG ACGCATAC\C GATCAGTTAA TAGATTTTAA 

5520 

* * * * * * 

GTGCCTTTTA ATATTTACAT ATAGTTGCAG CTTCGATTAG ATCATGTCCA CCAAACACTC 

5580 



ACAATTAGAG ACAAGCAAAA CTATAAACAT TGATCATAAA ATCATTACAA CATGTCCATA 



AATTAATTAT GGATTACAAA AATAAAAACT TACAAAAGAT CT 
Sequence Range: 1 to 6138



10 20 30 40 50 60 

* ' * ♦ * * it 

GAATTCX5TAA CAGAATTTAG TGAATAATAT TGTAATTACC AGGCAAGGAC TCTCCAAACG 

70 80 90 100 110 120 

* * ★ * ' * * 

GATAGCTCGA ATATCGTTAT TAAAGAGTAA ATGATCCAAT ATGTAAGCCA TTGTTGATCA 

130 140 150 160 170 180 

*♦♦♦** 

TCTAACATTG TTGGACTCTC TATTCCTCGA AATGATQCAT ACCtAATCAT TTATTCASIT 

190 200 210 220 230 240 

* . ♦ * * • * • * 

AACTATCAAG TnXZATTTGT AAAAACCAAA CATTTAAATT CAGATTTGAT AlCACTTACA 

« 

250 260. 270 280 290 300 

* * . * * • * * 

GAGGATAGAG AAGCATGACT CCAGGCCTGC ATX3CAACAAG AAAAAGGAAG AAAATAATGT 

310 320 330 340 350 360 

* * * ♦ * * 

TAAAAATTTG ACAAATATAG TCTTTATTTT TATTATATGA GACAGAATTT GAATAAAATC 

370 380 390 400 410 420 

* * . * ★ « * 

CTACCCAACT AGAGCATCAA AACGTTTTGC AATCGCAATA ATGAAACXTCA TmcmTr 

430 440 450 460 470 480 

* * ' * * * . * 

GAGTTTTTAC TCITCTTTCA ACAGAAACTT TCTCAAACGT CTTTAGCACT GTGACCTTAG 

490 500 510 520 530 540 

* •* * ★ * ★ 

ATATATACAC AAAAGCTTGA AAlTrCTTCA AGCAAAAGAA TCTTTGTGGG AGTTAAQGCA 

550 560 570 .580 590 600 

* * * * ♦ • ★ 

ACAAGCCAGG TAAAGAATCT CCAACX3CATT GTTACGTTTT CATGAACCTA TTTATTATAT 

610 620 630 640 650 660 

* ♦ * ♦ * * 

GITCTAAGAA AGAAAAAAAT ATCTCAAAGT AAACGTTGGA AATTITCTGA TGAAGGGAAA 

670 680 690 700 710 720 

* ♦ * * * « 

TCCAAAGTCT TGGGTTTAGT ATCCCTATGA ATGGTATTTG GAATATGTTT TCGTCAAAAC 

730 740 750 760 770 780 

* ♦ * ♦ ♦ ♦ 

AAAAGATTCr TrTCmTTC ACAAGAGTTA GTGATCAATA ACTTATGCAC TAATTAATGA 

790 800 810 820 830 840 

****** 

GATIGGACGT ATACACAATT TGATTATGAT ACTTGAGTAA AAATCACCTO TCTTTTAATT 

850 860 870 880 890 900 

* ♦ ♦ ♦ - ♦ * 

TGGAAATCTC tCTTTCTTAC CCATTTATAT ACTACTTCTT TTCATTAAAA TTAAATTTCA 



FIG. 8A 




910 920 930 940 950 960 

n ' * - * . * * * . 

• • • 

ATTATCAATC ATCGTrCAAT TTCATAAAGA TTTAACATTT TTTGTCACAG GGCTAGTAAA 

970 .980 990 1000 1010 1020 

* ' .* *#.. ♦ * ♦ 

AGCAATCTrr ACATAATTCA TCTTTCTTAC ATATATATAT TACCTTTTTC TTCATTAGTA ' 

1030 1040 1050 1060 1070 . 1080 

* . * * * * ' * ' 

TTCTATITGA TTATGATTAT TTrGTCATAA AGCTAGTAAA TTAAACACTC GATATGAGAA 

1090 1100 1110 1120 1130 1140 

* • ♦ * ★ * • . * 

TTATATTACT TCACGCTAAT TAACTCTTAA CACAACAAGA ACTAGTGCAT ATTCAACTTT 

1150 1160 1170 1180 1190 1200 

* * •* * . * * 

CAAAGCATAT ACTATATATT GAGAATATAG ACCACGAAAG TCAATCAAAA GACCTACCAG 

1210 1220 1230 1240 1250 1260 

* * * * * * 

CTCTCATCAA GTTCTTTCTT GAAATGATTT TGCAGAATTT CCAACTTAAT TAATTCGACA 

■ 

1270 1280 1290 1300 1310 1320 

* * * * * * 

TGAATGTGAA AATGTGTGTT GCTCGTTAAG AAAATTGAAT AGAAGTACAA TGAAAATGAT 

1330 1340 1350 1360 1370 1380 

* ♦ ♦ * ' * ' * . 

GAGGAATGGG CAAAACACAA AAGAGTTTCC TTTCGTAACT ACAATTAATT AATGCAAATC 

1390 1400 1410 1420 1430 1440 

* * •* * * * 

TGAGAAAGGG TTGATGGATA ATGACTACAC ACATGATTAG TCATTCCCCG TGGGCTCTCT 

1450 1460 1470 1480 .1490 1500 

. ' " .* ♦ * ★ * 

GCnTCATTT ACTTTATTAG TTTGATCTTC TCTAATTATA TTGTCGCATA TATGATGCAG 
1510 1520 1530 1540 1550 1560 

* * * *' A . * 

TTCTTTTGTC TAAATTACGT AATATGATGT AATTAATTAT CAAAATAAAT ATTCAAATTG 

1570 1580 1590 1600 1610 1620 

* ♦ ★ * * * . 

CCXSTTGGACT AACCTAATGT CCAAGATTAA GACTIGAACA TAAGAATTTT GGAAAAACTA , 

1630 1640 1650 1660 1670 1680 

* . ♦ * . ★ ♦ * 

AACXAGTTAT AATATATACT CTTAAATTGC CATTTCTGAA CAGAACCAAA TAATAATATA 

1690 1700 1710 1720 1730 1740 

* ♦ * * * ♦ 

TACTATTTAC AGTTTmTT AATTGGCAAG AACACTGAAA TCTTATTCAT TGTCTCGCTT 

, 1750 1760 1770 1780 . 1790 1800 

* ♦ ♦ ■ * * ' * 

GGTAGTTGAC AAGTTATAAC ACTCATATTC ATATAACCCC ATTCTAACGT TGACGACXjAA 

1810 1820 1830 1840 1850 1860 

* * ' . * * * * 
CACTCATATA AACCACCCAA ATTCTTAGCA TATTAGCTAA ATATTOGTTT AATTOGAAAT

1870 1880 1890 1900 1910 1920 

' * *' . . * * * * 

ATTTTTTTTA TATATAAAAT GCCAGGTAAA TATTAACGAC ATGCAATGTA TATAQGAGTA 

1930 1940 1950 1960 1970 1980

* * * * * *

GGGCAATAAA AAGAAAAGGA GAATAAAAAG GGATTACCAA AAAAGGAAAG TTTCCAAAAG 

1990 2000 2010 2020 2030 2040 

* * * * * *

GTGATTCTGA TGAGAAACAG AGCXCATACC TCTCTTTTTT CCTCTAAACA TCAAAGAAAA

2050 2060 2070 2080 2090 2100 

* * * * * *

ATTGGATGGT CCTCCTTCAA TGCTCTCTCC CCACXTCAATC CAAACCCAAC WlCTiUlTr 

2110 2120 2130 2140 2150 2160 

* * * * * *

CTTTCTTTTT TCTTCTTTCT AATTTGATAT TTTCTACCAC TTAATTCCAA TCAATTTCAA

2170 2180 2190 2200 2210 2220

* * * * * *

ATTTCAATCT AAATGTATGC ATATAGAATT TAATTAAAAG AATTAGGTGT GTGATATTTG 

2230 2240 2250 2260 2270 2280

* * * * * *

AGAAAATGTT AGAAGTAATG GTCCATGTTC TTTCTTTCTT TTTCCTTCTA TAACACTTCA

2290 2300 2310 2320 2330 2340 

* * * * * *

GTTTGAAAAA AAACTACCAA ACCTTCTGTT TTCTGCAAAT GGGTTTTTAA ATACTTCCAA

2350 2360 2370 2380 2390 2400 

* * * * * *

AGAAATATTC CTCTAAAAGA AATTATAAAC CAAAACAGAA ACCAAAAACA AAAAATAAAG 

2410 2420 2430 2440 2450 2460 

* * * * * *

TTGAAGCAGC AGTTAAGTGG TACTGAGATA ATAAGAATAG TATCTTTAGG CCAATGAACA 
2470 2480 2490 2500 2510 2520 

* * * * * *

AATTAACTCT CKMAATTC ATCTTCCCAT CCTCACTTCT CTTTCTTTCT GATATAATTA

2530 2540 2550 2560 2570 2580

* * * * * *

ATCTTCCTAA GCXM3TATG GTTATTGATG ATTTACACTT TTTTTTAAAA GTTTCTTCCT



2590 2600 2610 2620 2630 2640 

* * * * * *

TTTCTCCAAT CAAATTCTTC AGTTAATCCT TATAAACCAT TTCTTTAATC CAAGGTGTTT



2650 2660 2670 2680 2690 2700 

* * * * * * 

GAGTGCAAAA GGATTTGATC TATTTCTCTT GTGTTTATAC TTCAGCTAGG G^ATAGAA

translation start

2740 2750 2760 exon 2



ATGGAGGGTG GTGCGAGTAA TGAAGTAGCA GAGAGCAGCA AGAAGATAGG GAGAGGGAAG
2770 2780 2790 2800 2810 2820

* * * * * *

ATAGAGATAA AGAGGATAGA GAACACTACG AATCGTCAAG TCACTTTCTG CAAACGACGC

2830 2840 2850 2860 2870 2880 

* * * * * *

AATGGTTTAC TCAAGAAAGC TTATGAGCTC TCTGTCTTGT GTGACGCTGA GGTTCCTCTT

2890 2900 2910 2920 2930 2940

* * * * * *



GTCATCTTCT CCACTCGAGG CCGTCTCTAC GAGTACGCCA ACAAC^STA CACATCTTTT

2950 2960 2970 2980 2990 3000

* * * * * *

AGCTAGATCT TGATTTTGTT GAATTTTTTT TCTAGAATAA AGTTTCGACT CTTCTGGTGG

3010 3020 3030 3040 3050 3060 

* * * * * *

GTTTTTCAAT CTTTATGGTC TCTTTATAGT Tri ' milLTi ' AGTITCTCTG AAGCTCAAAT 

3070 3080 3090 3100 3110 3120 

* * * * * *

CTCTTTAAAA ATCCCCAAAA TTAGGGTTTG TTTAAAACTA GGGAACCCTA CTTTAACTTC

3130 3140 3150 3160 3170 3180 

* * * * * *

TTTCTCTTAG TAAAAAAGCA GTGAGGGTCT TCTCTGATCA TTAATTAGCA TCCCCCATAC 

3190 3200 3210 3220 3230 3240

* * * * * *

CTTGTTCCAG TCACTTTTTC TCCACAAATC CTTATAACAG TATCTATATA TGTATCTATT 

3250 3260 3270 3280 3290 3300 

* * * * * *

TATGTCAGTT TGTACAAGAC ACTTCGATCA ATTTGATGAC CCATCAAGTT TTATTTCTGC
3310 3320 3330 3340 3350 3360 

* * * * * *

AGATTGATCA TTAGGTTTCC ATCATAGTAA TGAAAAAGTA GGGTTCTTGA TAAAATTATA

3370 3380 3390 3400 3410 3420 

* * * * * *

ATAATATATA TTATTTGGCT ATATAAAAAA GCTATGTAGA TTCCTTAAAA ATTGATTCAC 

3430 3440 3450 3460 3470 3480 

* * * * * *

TAGGGAGAGA CTAGTAGGTG TTTCTCTTCT GACACTTCTC TAATCTTTTG GTGAATCCTT 

3490 3500 3510 3520 3530 3540 

* * * * * *

TTGTTAAATC AAGAAAATGA ATCAGGGACA AAGCTTATTG TTGAGTCACT TAATTAATCA 

3550 3560 3570 3580 3590 3600 

* * * * * *

TCCGATCCAT CAATCAAGAA AAATAACGAA ACAGAAAATT TTGATTTTTG ATTGTTATTT 

3610 3620 3630 3640 3650 3660 

* * * * * *

TCTCCACTTC AAGTTGGGGA CTTGTCATTT CCGTTTTTCT ATACGTTTCC AGCTATTAAC 

3670 3680 3690 3700 3710 3720 

* * * * * *
AGCTOmSTT CATTTCACCA TTTTOATTAT TI V lCmCTT TTTAAAGATA AA'lUl'iTlXA

3730 3740 3750 3760 3770 3780 

* * * * * *

AAAATATTGT TTTTATTTGC TTCGCTAGTT AATACTATAA TTGAGGTTGA TGTATGACTA

3790 3800 3810 3820 3830 3840 

* * * * * *

TAATCTATAA GTCAAGTCTC ATATCATGGA TCTAAGTTAA AACTAGTAAA TTTGTAGTTT

3850 3860 3870 3880 3890 3900

* * * * * *

CAATGTGAAC TTTCACAACXS ACTAAAGAAC TGATCTGAAG OTTATAATGG ACATGACTAA

3910 3920 3930 3940 3950 3960 

* * * * * *

TTTGATTAAC AAAAGAGGAA TGGATTATGT ATGTAGAAAC ATGTTCATATA TATATCTTTC

3970 3980 3990 4000 4010 4020 

* * * * * *

TATTATCAAA AGTGTAGTTA ACTTTCTTAT TTCAAACACC CTCATGCTTT AGTAGTATCT

4030 4040 4050 4060 4070 4080

* * * * * *

TACrmUAC ATTTCTCAAC TTCAGCTTTC CATTATACAA CAGCACAATG TAAATTACTT

4090 4100 4110 4120 4130 4140 

* * * * * *

GTATATiSAAT ATGAAAGCAT AACGTTATGC AAAGATTTCT AGCTTTTCTT TTTCTGTTTT

4150 4160 4170 4180 4190 4200 

* * * * * *

GCAAAAGATT TACAAATATC ATGTTCTTGG TAAAAACATA CTlXXrCICAG CCACATATGC

4210 4220 4230 4240 4250 4260 

* * * * * *

ATGTAAATGT AATGTTCAAA TATTAATTCA GGAAAAACAA AGAAGAAGCA AAATTAGCTT

4270 4280 4290 4300 4310 4320 

* * * * * *

CTAGAGTAGG GAATCTATTG ACTTGACCTG AAAATCACTT CTTTTTCTTA AAGCCTAGXA

4330 4340 4350 4360 4370 4380 

* * * * * *

GTGAATTTTT TAAOCTAATT AGGCCAAAAT ATATACTAGC CTAAAATATA ATTTGCATTT

4390 4400 4410 4420 4430 4440 

* * * * * *

TGTGTCGTAC ATAAATTGGG ACCAATTCCA ATTAACTAAG AGCATATGCA ATTCAAATTC

4450 4460 4470 4480 4490 4500 

* * * * * *

TTTTTATTTT CTTCTCCXjAT TTGCTACTTC TTTCTTTTGT ATGTTTTCAA ATTAGGATTA

4510 4520 4530 4540 4550 4560 

* * * * * *

CACTTTTTTG GGGAAGTACA CATTAGGGTC TTCTCGAACT TTGATTATAC ATATATATAT

4570 4580 4590 4600 4610 4620 

* * * * * *

ATATATATAT ATATAACTTT GTGAGATC3TC ACTOTTAATA GATAATAGGC AATAACAATA 

FIG. 8E 
4630 4640 4650 4660 4670 4680

* * * * * *

ATATCCAAAA AAGAAGGCGC AAACAAATCA TATACTATAT GGTACTGGTC CATTCACTAT 

4690 4700 4710 4720 4730 4740 

* * * * * *

TTTGTCGCnT GAATTTAAGG TTTGGCGTAC AAACTTTCTT TCAAACCTTT ATTATTCCGT

4750 4760 4770 4780 4790 4800

* * * * * *

CTTTCTGTC3T GTTTTGTATA TCCAGAAGAT AAAAATATCA ATTTCTTTAA CGACTTCATA

4810 4820 4830 4840 4850 4860 

* * * * * *

TATATATATA TATATATATA TATATATATT 'ITlViiri'li: ! ' GGTTTTAGTG TTTGAATCCA 

4870 4880 4890 4900 4910 4920 

* * * * * *

ACAGTTATAG TTTCGTGTXST ClTiUTm'A CTTGTGGTGG TTTAAGTTTG AGATTTTCAC

4930 4940 4950 4960 4970 4980 

* * * * * *

CGATTGCATC TATTTACATA TATAGCTACC ACAAAAAAGA TTGCATTTTA AAAOXmnC

4990 5000 5010 5020 5030 5040 

* * * * * *

CTTTGTGTGA ATGTTGATGA AGttCTGAGAG GAACAATAGA AAGGTACAAG AAAGCTTOCT 

5050 5060 5070 5080 5090 5100

* * * * * *

CXXACX3CCGT TAACCCTCCG ACCATCACCG AAGCTAATAC TCAqSTTAGC OTITAATrAA 

5110 5120 5130 5140 5150 5160

* * * * * *

TACACCTAGC TAGCTAGTTC GTTAATTACT TAATTTCTTC TTCmTAGT TATCTCACCT

5170 5180 5190 5200 5210 5220 

* * * * * * 

TTTTTTCACC TCTTGTAACA ATGATGGGAT CGAAATTGAT GAAC3&CTAT CAGCAAGAGG

5230 5240 5250 5260 5270 5280

* * * * * *

CXSTCTAAACr CCGGAGACAG ATTCGGGACA OTCAGAATTT GAACAGACAC ATTCTTCGTC 

5290 5300 5310 5320 5330 5340 exon 4

* * * * * *

AATCTCTTGG TTCCTTGAAC TTTAAGGAAC TCAAGAACCT TCAAAGTAGG CTTCAGAAAG

5350 5360 5370 5380 5390 5400 

* * * * * *



GAATCAGTCG TGTCCGATCC AAGA^STAC ATCACTAACT CTCCATCAAT CTCCTTATCA 

5410 5420 5430 5440 5450 5460 

* * * * * *
TTGAATATAT ATCCATCTGA TTCTTGCCCG TTATATTTCG ' I ' n T l tJ'l Cii; CAG^CGAGA 

5470 5480 5490 5500 5510 5520 exon 5 

* * * * * *

TGTTAGTTGC AGAGATTGAA TACATGCAAA AAAGG 3TAAA AGTAAAACCT ATC T ICCTIC 

5530 5540 5550 5560 5570 5580

* * * * * *
ACAATCSAACT ACCCCrTACTT TATTAGCAAC TTCTCITICr GATGATCATC rrmTATiT

5590 5600 5610 5620 5630 5640

TCKnTCTCG CrrGCATTGT ACS^AATCGA GCTGCAAAAC GATAACATGT ATCTCCbCTC 

5650 5660 5670 5680 5690 5700 exon 6 

* * * * * *



CA^ 



?nTTA TACATAACTC nTTTGGCAT TTTTGATCAT CATTTTnTC CGGTAGACAA 



5710 5720 5730 5740 5750 5760

* * * * * *



TCTCTTGATG TOCAAATTCT AAATATCTCT GCAG^TACT GAAAGAACAG GTCTACAGCA 
5770 5780 5790 5800 5810 5820 

ACAAGAATCG AGTGTGATAC ATCAAGGGAC AGTTTACXaAG TCGQGTGTTA CTTCOTCTCA exon 7 



5830 5840 5850 5860 5870 5880

* * * * * *



CCAGTCGGGG CAGTATAACC GGAATTATAT TGCGiG?lTAAC CTTCTTGAAC CGAATCAGAA



5890 5900 5910 5920 5930 5940 Stop

* * * * * *



TTCCTCCAAC CAAGACXIAAC CACCTCTGCA ACTTGTTTGA TTCAGTCTAA CAT/AGCTTC



5950 5960 5970 5980 5990 6000

* * * * * *



TTTCCTCAGC CTGAGATCiSA TCTATAGTGT CACCTAAATG CGGCCX3CGTC CCTCAACATC 



6010 6020 6030 6040 6050 6060

* * * * * *



TAGTCGCAAG CTGAGGGGAA CCACTAGTGT CATACX5AACC TCCAAGAGAC GGTTACACAA 



6070 6080 6090 6100 6110 6120

* * * * * *



ACGGGTACAT TGTTGATGTC ATGTATGACA ATCGCCCAAS TAAGTATCCA GCTGTGTTCA 



6130 
* 

GAACGTACX?r CCGAATTC 